





VOLUME XXIV NEW SERIES, NO. 168 


JOURNAL 


OF THE 


AMERICAN STATISTICAL 
ASSOCIATION 


DECEMBER, 1929 


CONTENTS 
THE ANALYSIS OF FREQUENCY DISTRIBUTIONS. By G. R. Davies 
A SECOND CATEGORY OF LIMITATIONS IN THE APPLICABILITY 
OF THE CONTINGENCY COEFFICIENT. By J. Arthur Harris
and Chi Tu


HORSEPOWER STATISTICS FOR MANUFACTURES. By Willard L.
Thorp


A SIMPLIFIED METHOD OF GRAPHIC CURVILINEAR CORRELA- 
TION. By L. H. Bean. 


THE NEED FOR AN INDEX FOR SOCIAL DATA. By Mary Johnston


DETERMINATION OF A PRECISE INDICATION OF CHANGE IN 
CROP ACREAGE. By A. J. Beyleveld


COTTON FUTURES AS FORECASTERS OF COTTON SPOT PRICES. 
By Forrest Bee Ashby
NOTES:
The Relative Importance of Check and Cash Payments. By Arthur
F. Burns.


The New Trend in Distribution (425); Progress of the Census (427);
Miscellaneous Notes (431); Members Added (435).


REVIEWS 


PUBLISHED QUARTERLY BY THE 
AMERICAN STATISTICAL ASSOCIATION 
PUBLICATION OFFICE: Rumford Press, Concord, N. H.
EDITORIAL OFFICE: Columbia University, New York City


Price $1.50 per copy $6.00 per annum 




















THE EMPLOYMENT CLEARING HOUSE 


OF THE 
AMERICAN STATISTICAL ASSOCIATION 


will aid you with your employment problem. It makes no charge to employers for its 
services. Any member of the Association may register for a position. There is no fee.


The following positions were open in November, 1929: 











Pay 





Field $2500- $5000 
3600 and over 


M/F M 

















Agricultural Economics 
Business-Mfg. and Selling 
Economic and Industrial Research


Governmental 

Social Research 
Statistical Secretarial 
Teaching 









































Members were looking for openings as listed: 





Pay 











Agricultural Economics 
Business-Mfg. and Selling 

Drafting 

Economic and Industrial Research


Insurance 
Marketing 
Mathematics 
Social Research 





Vital Statistics 
































COMMUNICATE YOUR NEEDS TO 
MRS. ROSE L. WOODBURY 


Secretary’s Assistant, American Statistical Association 
Room 530, Commerce Bldg., 236 Wooster St., New York City 


[NOT A PAID ADVERTISEMENT]










NEW SERIES, NO. 168 (VOL. XXIV) DECEMBER, 1929 


JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Formerly the Quarterly Publication of the American Statistical Association 








CONTENTS 
THE ANALYSIS OF FREQUENCY DISTRIBUTIONS. By G. R. Davies 


A SECOND CATEGORY OF LIMITATIONS IN THE APPLICABILITY 
OF THE CONTINGENCY COEFFICIENT. By J. Arthur Harris and Chi Tu


HORSEPOWER STATISTICS FOR MANUFACTURES. By Willard L. Thorp


A SIMPLIFIED METHOD OF GRAPHIC CURVILINEAR CORRELATION. By L. H. Bean


THE NEED FOR AN INDEX FOR SOCIAL DATA. By Mary Johnston


DETERMINATION OF A PRECISE INDICATION OF CHANGE IN 
CROP ACREAGE. By A. J. BEYLEVELD 


COTTON FUTURES AS FORECASTERS OF COTTON SPOT PRICES. 
By Forrest Bee Ashby


NOTES: 


The Relative Importance of Check and Cash Payments. By Arthur F. Burns


The New Trend in Distribution (425); Progress of the Census (427);
Miscellaneous Notes (431); Members Added (435).


REVIEWS: Warren and Pearson: Interrelationships of Supply and Price, by 
H. Working (437); King: Trends in Philanthropy, by R. G. Hurlin (442); 
Sloan: Corporation Profits, by G. O. May (443); Senate Document 46: Supply
of Electrical Equipment and Competitive Conditions, by E. E. Lincoln (444);
Mouzon: The Determination of Secular Trends, by Simon Kuznets (445);
Gebhart: Funeral Costs, by Niles Carpenter (446); Montgomery: The
Coöperative Pattern in Cotton, by M. T. Copeland (447); Herzog: The
Morris Plan of Industrial Banking, by J. G. Rolph (449); Kock: A Study of
Interest Rates, by K. Simpson (449); Adams, Lewis and McCrosky: Population,
Land Values and Government, by G. B. L. Arner (450); March:
Demographie, by E. W. Kopf (452); Tooke and Newmarch: A History of
Prices and of the State of the Circulation from 1792 to 1856 (Revised Edition),
by F. C. Mills (453); Reilly: Marketing Investigations, by F. E. Clark (454).





Published Quarterly by the American Statistical Association 
Rumford Press Building, Concord, N. H.
Editorial Office, Columbia University, New York City 
Entered at the post-office, Concord, N. H., as second-class mail matter 


Acceptance for mailing at the special rate of postage provided for in Section No. 1103, Act of 
October 3, 1917, authorized April 29, 1922 










































Officers of the American Statistical Association 


ORGANIZED NOVEMBER 27, 1839 





PRESIDENT Edwin B. Wilson 

VICE-PRESIDENTS Francis Walker Ralph G. Hurlin
Mordecai Ezekiel 

COUNSELLORS F. Leslie Hayford R. H. Coats 


Robert E. Chaddock 


SECRETARY-TREASURER Willford I. King 
Room 530, Commerce Building, 
New York University 
236 Wooster Street, New York City 


CHAPTER AND DISTRICT SECRETARY


AUSTIN (TEXAS) CHAPTER
Carroll D. Simmons, University of Texas, Austin, Texas
BOSTON CHAPTER
Roswell F. Phelps, Massachusetts Department of Labor and Industries,
Boston, Mass.
CHICAGO CHAPTER
Robert B. King, Illinois Bell Telephone Co., 212 Washington Street,
Chicago, Ill.
CLEVELAND CHAPTER
D. C. Elliott, Midland Bank, Cleveland, Ohio
COLUMBUS (OHIO) CHAPTER
Harry S. Will, 66 Fallis Road, Columbus, Ohio
LOS ANGELES CHAPTER
Ira N. Frisbee, 33114 North Beverly Drive, Beverly Hills, Calif.
SAN FRANCISCO CHAPTER
Oliver P. Wheeler, Federal Reserve Bank, San Francisco, Calif.


AREA REPRESENTED AND DISTRICT SECRETARY
DETROIT AND VICINITY 
Lester K. Kirk, Standard Accident Insurance Company, Detroit, Mich. 
PITTSBURGH AND VICINITY 
George A. Doyle, Bell Telephone Co., Pittsburgh, Pa. 
WASHINGTON AND VICINITY 
Thomas B. Rhodes, Federal Reserve Board, Washington, D. C. 


EDITOR Frank Alexander Ross
405 Fayerweather Hall,
Columbia University, New York City


REVIEW EDITOR Leo Wolman
30 Fifth Avenue, New York City
ASSOCIATE EDITORS Edmund E. Day Wesley C. Mitchell


William F. Ogburn Walter F. Willcox 






























THE ANALYSIS OF FREQUENCY DISTRIBUTIONS 


By G. R. Davies, University of Iowa


It is generally recognized that the process of tabulating data in 
classes may give rise to a considerable degree of distortion of the 
original data. Even when the class limits are chosen with the utmost 
care, so as to secure an adequate scatter in each class, the measures of 
central tendency, dispersion, and type of distribution may be none too 
dependable. Nevertheless, frequency classifications are indispensable, 
both as a means of summing up compactly a large array of data, and as 
a form of generalization. Very often, too, it happens that the statisti- 
cian must work with tabulations made up from original sources which 
he cannot conveniently consult. Hence the analysis of frequency 
distributions is likely to remain one of the important branches of 
statistics. 

If one consults textbooks regarding the methods of such analysis, he 
will find on the one hand some rather crude methods of computing a 
mean, interpolating a mode, a median, or other percentile, and measur- 
ing dispersion and skewness, with perhaps only the most casual refer- 
ence to the determination of the type of curve. On the other hand, in 
the advanced texts, one will find a theoretical correction applicable in 
certain cases to one of the errors of tabulation, with a neglect of 
similar errors elsewhere, and a brief description of the elaborate analysis 
of types devised by Pearson, with perhaps a reference to the Charlier 
theory. Otherwise, the methods of computation will be in the main 
the approximations presented in the elementary texts. 

Recently there have appeared, however, certain refinements of com- 
putation indicating a recognition of the need for methods of analysis a 
little more precise than those of the elementary textbook, and yet not 
requiring the elaborate theory and extended calculations of the Pearson 
system. It appears to the writer that there is much promise for
applied statistics in this tendency, and that it is worthy of somewhat
careful investigation. It is the purpose of this discussion to mention 
one or two of the contributions in the field indicated, to point out other 
possible refinements, and to suggest further needs. It is not argued 
that such methods should be generally used; in much ordinary work the 
simpler methods now in use suffice. It is merely suggested that meth- 
ods of greater accuracy should be available for those cases which 
demand it. 

The primary inaccuracy in calculations based on frequency tabula- 
tions is that arising from the use of the class mark (m, the mid-point of 
the class) as the magnitude of each item included in the class fre- 
quency (f). It is obvious that expressions like fm do not adequately 
represent the class because of the tendency for frequencies to increase 
toward the mode. Expressions like $fm^2$ are subject to the further
error arising from the fact that the arithmetic mean and the quadratic 
mean of a series of numbers are not identical, the latter being in effect 
weighted toward the larger numbers. 

The errors arising from the use of the class mark as representative of 
the class are not very great in calculating such types as the arithmetic 
and geometric means, since the errors on opposite sides of the mode are 
in large part compensating. But in measures of dispersion the aggre- 
gate error may be considerable. The well-known Sheppard’s correc- 
tion precisely removes this error from tabulations that are entirely 
normal, but unfortunately it may make matters worse when applied to 
irregular distributions not having close contact. Hence some means 
of estimating the error in ordinary distributions is desirable. 

There are two rather distinct methods of approach to such a problem. 
The one assumes the given frequencies as fixed, and seeks merely to 
minimize the error of tabulation in each class as it stands; the second 
assumes that the given frequencies represent an approximation to a 
type, which may be discovered by a smoothing or curve-fitting process. 
These two general methods will be considered in the order named. 


I 


As a step toward greater accuracy in dealing with frequency classes 
taken as fixed, Professor King has developed a simple method for 
obtaining the correct value of fm on the assumption that the slope of the 
frequency curve at the position of a given class may be taken as uniform
throughout the class. In a similar way a correction for $fm^2$ may
be obtained. Denoting the class mark corrected for the purpose at
hand as $m_c$, and the slope of the curve throughout the class as $b$ per
unit of the class interval ($I$), we have:

$$fm_c = fm + b\,(I^2/12)$$
and
$$\Sigma fm_c = \Sigma fm + (\Sigma b)(I^2/12)$$
$$fm^2_c = fm^2 + (f + 2bm)(I^2/12)$$
and
$$\Sigma fm^2_c = \Sigma fm^2 + (n + 2\Sigma bm)(I^2/12).$$


If, however, deviations (d) from a median or other origin are taken, 
then d is substituted for m. These formulas may be most easily 
derived on the assumption of a certain number of sub-classes (s), in 
which case the factor $(s^2 - 1)/s^2$ is a part of the last term of each.
But this factor may be dropped when the sub-classes are thought of as 
infinite in number. 

In applying the formulas there may be some question of the appro- 
priate slope to use. Except in very irregular distributions, the slope 
may most simply be determined as a first approximation by the usual 
frequency polygon. The polygon shows two slopes within each class 
interval, and these averaged become,

$$b = (f_2 - f_1)/(2I)$$

in which $f_1$ and $f_2$ represent the frequencies respectively preceding and
following the given frequency on the magnitude scale. It is true that
these slopes, if actually constructed, do not give a well smoothed
curve. Also, cases of irregular distributions may occur where a slope
must be arbitrarily determined. In any case, this first approximation
should be limited to the magnitude $b = \pm 2f/I$ in order to avoid negative
frequencies. But since the slope thus determined normally puts each 
given class in line with the adjacent frequencies and does not change the 
total frequency itself, it constitutes a much more probable assumption 
regarding the distribution of the frequencies within the class than is 
implied in the usual rectangular frequency. And a double slope, to be 
described later, may be applied as a second approximation. 
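
As a minimal sketch of the corrections above, the following Python fragment computes the averaged polygon slope for each class and the corrected sums $\Sigma fm_c$ and $\Sigma fm^2_c$. The function names are illustrative and do not come from the article. Zero frequencies are assumed beyond both ends of the table, the class interval is assumed to be an integer so that exact fractions can be used, and each slope is limited to $\pm 2f/I$ as recommended in the text.

```python
from fractions import Fraction

def slopes(freqs, interval):
    """Averaged polygon slope b = (f2 - f1) / (2 I) for every class."""
    padded = [0] + list(freqs) + [0]          # assumed zero frequencies at the extremes
    result = []
    for k, f in enumerate(freqs):
        b = Fraction(padded[k + 2] - padded[k], 2 * interval)
        limit = Fraction(2 * f, interval)     # |b| <= 2f/I avoids negative frequencies
        result.append(max(-limit, min(limit, b)))
    return result

def corrected_sums(marks, freqs, interval):
    """Return (sum of f*m corrected, sum of f*m^2 corrected)."""
    b = slopes(freqs, interval)
    k12 = Fraction(interval ** 2, 12)
    n = sum(freqs)
    s1 = sum(f * m for f, m in zip(freqs, marks)) + sum(b) * k12
    s2 = (sum(f * m * m for f, m in zip(freqs, marks))
          + (n + 2 * sum(bk * m for bk, m in zip(b, marks))) * k12)
    return s1, s2

# Data of the article's Tables I and II: class marks 22..34, interval 4.
marks, freqs, interval = [22, 26, 30, 34], [2, 4, 3, 1], 4
s1, s2 = corrected_sums(marks, freqs, interval)
mean = s1 / sum(freqs)
variance = s2 / sum(freqs) - mean ** 2
print(float(mean), float(variance) ** 0.5)
```

Applied to the distribution used in Tables I and II, this sketch should agree with the mean of 27.1833 and the standard deviation of about 3.55 reported there.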

Before illustrating the application of the formulas just stated, men- 
tion may be made of a further error which enters into the ordinary 
calculation of the average deviation. Professor Rietz has called atten- 
tion to this error in Chapter II of Handbook of Mathematical Statistics, 
and has there stated a formula for eliminating it. The error arises 
from the fact that the deviations within the class containing the me- 
dian, or other point of origin (R), are in part positive and in part 
negative. The usual method of writing the deviations ($d$) as $d = m - R$
therefore counts only the algebraic total of the deviations in this class.
Thus if $R$ nearly coincides with $m$, practically all the deviations of the
class of origin are excluded. Professor Rietz’ formula, adjusted to any
class interval ($I$), gives the total deviations corrected ($fd_c$) of the class
of origin, as:

$$fd_c = f(.25I + d^2/I),$$

in which $d = m - R$. If, however, the class is given a slope, as previously
described, then this formula expands to the form:

$$fd_c = f(.25I + d^2/I) + bd\,(.25I - d^2/(3I))$$
or
$$fd_c = fI/4 + bdI/4 + fd^2/I - bd^3/(3I).$$


In order to derive the foregoing expansion as well as to obtain the 
median and other percentiles on the basis of the sloped frequencies, it 
is necessary to obtain a formula for subdividing the frequency of the 
class of origin at the point of origin ($R$) into two consecutive sub-frequencies
($g$ and $f - g$). This formula may be obtained by integrating
the area up to the given point, as follows. If $x$ equals the
distance from the lower limit of the class to the origin ($x = R - L_1$), $b$
equals the slope per unit of the interval ($bI = f_2/2 - f_1/2$), and $h$ equals
the height of the sloped frequency at the lower limit of the class
($h = f - bI/2$), then the area under the line, $h + bx$, from the lower limit
of the class to the origin is $hx + bx^2/2$, and the required area ($g$) in
units of class intervals is $g = (hx + .5\,bx^2)/I$.

It is more convenient, however, to write this formula in terms of the 
class mark ($m$), the frequency ($f$), the class interval ($I$), and the slope
($b$) per unit of the class interval, and the deviation ($d$) of the class
mark from the origin ($d = m - R$), thus: $g = f/2 - bI/8 - fd/I + bd^2/(2I)$,
which, when $b = 0$, reduces to $g = f(.5 - d/I)$.

The second sub-frequency (f—g) is readily obtained by subtraction, 
or by the formula for g with the signs of all terms except the first 
reversed. 

In order to locate any percentile ($P$) on the X-scale, we may consider
this percentile as the point of origin, so that $d = m - P$ or $P = m - d$.
By solving the preceding equations for $d$ we have:

$$P = m - f/b \pm (f^2/b^2 + 2gI/b - fI/b + I^2/4)^{1/2}$$
or¹
$$P = m - f/b \pm [(f/b - I/2)^2 + 2gI/b]^{1/2}$$

in which the sign of the radical follows that of $b$; or, if $b = 0$,

$$P = m - I/2 + gI/f$$


the latter being merely the ordinary interpolation for a percentile, as 
the expression m—I/2 is obviously the lower limit of the class. These 
formulas, applied to the class in which the percentile falls, will give its 
magnitude. 
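
The subdivision formula and its inverse can be sketched in Python as follows. This is an illustration under the definitions given above, not the author's own procedure; the function names are hypothetical, and the caller is assumed to supply the class in which the percentile falls together with the sub-frequency $g$ lying below it.

```python
import math

def sub_frequency(f, b, m, interval, origin):
    """Part g of the class frequency lying below `origin` (sloped frequency)."""
    d = m - origin
    return f / 2 - b * interval / 8 - f * d / interval + b * d * d / (2 * interval)

def percentile_in_class(f, b, m, interval, g):
    """Point P in the class below which the sub-frequency g lies."""
    if b == 0:
        # Ordinary interpolation from the lower class limit m - I/2.
        return m - interval / 2 + g * interval / f
    root = math.sqrt((f / b - interval / 2) ** 2 + 2 * g * interval / b)
    # The sign of the radical follows that of b.
    return m - f / b + math.copysign(root, b)

# Median of the article's example: class mark 26, f = 4, b = 1/8, I = 4, g = 5 - 2 = 3.
median = percentile_in_class(4, 1 / 8, 26, 4, 3)
print(round(median, 4))
```

Feeding the computed point back into `sub_frequency` should return the same $g$, which gives a quick check that the two formulas are inverses of each other.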


¹ In the notation previously used this becomes

$$P = L_1 - h/b \pm (h^2/b^2 + 2Ig/b)^{1/2}.$$













TABLE I

COMPUTATION OF THE AVERAGE DEVIATION ($AD_c$) ON THE BASIS OF EACH
FREQUENCY SLOPED TOWARD THE MODE, AS DETERMINED BY THE
PRECEDING AND SUCCEEDING FREQUENCIES ($f_1$ and $f_2$)

The slope ($b$) applied to each frequency is expressed per unit of magnitude within the class interval
($I$). The median ($Md$) is taken as the point of origin of the deviations ($d$). It is interpolated by the
ordinate in the origin class at one-half $n$ less the sum of preceding frequencies ($g = 5 - 2$). The formulas
employed are given at the foot of the Table.

   m       f      d (m − Md)        fd            + b(I²/12)
   22      2      −5.0454        +10.0908        −4/8 × 4/3
   26      4      −1.0454        (4.9741, see below)
   30      3       2.9546          8.8637        −3/8 × 4/3
   34      1       6.9546          6.9546        −3/8 × 4/3
         n = 10                   30.8832       −10/8 × 4/3 = −1.6667
                                                 Σfd_c = 29.2165

$AD_c = \Sigma fd_c \div n = 29.2 \div 10 = 2.92$
$Md = m - f/b + [(f/b - I/2)^2 + 2gI/b]^{1/2} = 26 - 32 + (900 + 192)^{1/2} = 27.0454$
(Limit when $b = 0$: $Md = m - I/2 + gI/f$)
Slope $= b = (f_2 - f_1)/(2I)$
(Origin class) $fd_c = fI/4 + bdI/4 + fd^2/I - bd^3/(3I) = 4 - .1307 + 1.0929 + .0119 = 4.9741$














We may illustrate the use of these formulas by applying them to a 
simple distribution as in Tables I and II. In the former the average 
deviation ($AD_c$), corrected for the assumed slope of each frequency, is
found. The slope, as previously indicated, is the average slope of the
frequency polygon within the given class, which is determined by the
slope from the preceding to the succeeding frequency ($f_1$ to $f_2$). The
median ($Md$) is first computed as indicated at the foot of the table and
the deviations ($d$) of the class marks are taken from the median as
origin. The corrected deviations ($fd_c$) are obtained as $fd + bI^2/12$,
except that the aggregate deviations taken as positive are obtained for
the class of origin by the special formula previously mentioned, as
shown at the foot of the table. In totaling the deviations, the signs of
the negative deviation and its correction term are reversed as indicated.
It is convenient to sum the $b$’s, and multiply the total by $I^2/12$ before
combining these correction terms with the sum of the fd column. The 
resulting average deviation, 2.92, is somewhat smaller than that ob- 
tained by the usual method, which is 3.00. 
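
The computation of Table I can be retraced with a short Python sketch. It is offered only as a check on the arithmetic: the function name is illustrative, zero frequencies are assumed beyond the ends of the table, and the limit $b = \pm 2f/I$ is not enforced because it does not bind for these data.

```python
import math

def sloped_average_deviation(marks, freqs, interval):
    """Return (median, average deviation) with each frequency sloped toward the mode."""
    n = sum(freqs)
    padded = [0] + list(freqs) + [0]
    b = [(padded[k + 2] - padded[k]) / (2 * interval) for k in range(len(freqs))]

    # Locate the class containing the median and the sub-frequency g below it.
    below = 0
    for k, f in enumerate(freqs):
        if below + f >= n / 2:
            break
        below += f
    g = n / 2 - below
    f, bk, m = freqs[k], b[k], marks[k]
    if bk == 0:
        median = m - interval / 2 + g * interval / f
    else:
        root = math.sqrt((f / bk - interval / 2) ** 2 + 2 * g * interval / bk)
        median = m - f / bk + math.copysign(root, bk)

    total = 0.0
    for j, (mj, fj, bj) in enumerate(zip(marks, freqs, b)):
        d = mj - median
        if j == k:
            # Class of origin: deviations on both sides counted as positive.
            total += (fj * interval / 4 + bj * d * interval / 4
                      + fj * d * d / interval - bj * d ** 3 / (3 * interval))
        else:
            total += abs(fj * d + bj * interval ** 2 / 12)
    return median, total / n

median, ad = sloped_average_deviation([22, 26, 30, 34], [2, 4, 3, 1], 4)
print(round(median, 4), round(ad, 2))
```

Taking the absolute value of the corrected class total reproduces the sign reversal applied in the table to the class lying below the median.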

The calculation of the standard deviation in Table II follows the 
usual short-cut method. The second class mark, 26, is assumed to be 
the average, and the deviations are taken from it. The fd and fd? 
columns are corrected in accordance with the formulas previously 
discussed. Since algebraic sums are required, special formulas are not
needed for the class of origin. The standard deviation is obtained from
the corrected sums of deviations and squared deviations in accordance
with the usual method. The result is $\sigma = 3.55$, as contrasted with 3.60
by the usual method, and 3.41 by the use of Sheppard’s correction.

CHART I

THE FREQUENCY DISTRIBUTION OF TABLES I AND II, WITH SLOPED FREQUENCIES;
ALSO THE CUMULATIVE CURVE BASED UPON THESE FREQUENCIES,
WITH THE CORRESPONDING GRAPHIC INTERPOLATION OF THE
QUARTILES ($Q_1$, $Q_2$ AND $Q_3$)*

(Horizontal axis: class limits.)

*The equation of the cumulative curve is $Y = C + hx + bIx^2/2$, where $C$ is the cumulative at the
lower limit of a given class; $h$ is the height of the sloped frequency at the same point, to be applied
within the given class; and $x$ is a point on the given class interval ($I$), where the class interval is taken as
extending from zero to one. The cumulative curve is a parabola determined within each class by the
integration of the sloped frequency. The light dotted line is a second approximation, smoothing the
first sloped frequencies, and doubling the number of classes, as described later. (See Table III.)

The last result is undoubtedly too small, as the distribution does not
have close contact. In fact the method as here illustrated may be
shown to be Sheppard’s correction adjusted for lack of close contact,
since if corrections are included for zero frequencies at the extremes of
the distribution, $\Sigma bd = -n$, and we have Sheppard’s correction.
With regular slopes the problem might be profitably approached from
this angle.
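
The remark that the sloped-frequency method reduces to Sheppard’s correction once the zero classes at the extremes are included can be checked numerically. The Python sketch below is an illustration only: the function name and the `include_zero_classes` switch are hypothetical, and the slopes are left unclamped because the empty end classes must carry a slope for the identity to hold.

```python
def sloped_variance(marks, freqs, interval, include_zero_classes):
    """Variance from sums corrected for sloped frequencies."""
    marks, freqs = list(marks), list(freqs)
    if include_zero_classes:
        # Add one empty class at each extreme of the distribution.
        marks = [marks[0] - interval] + marks + [marks[-1] + interval]
        freqs = [0] + freqs + [0]
    n = sum(freqs)
    padded = [0] + freqs + [0]
    b = [(padded[k + 2] - padded[k]) / (2 * interval) for k in range(len(freqs))]
    d = [m - marks[0] for m in marks]          # deviations from an arbitrary origin
    k12 = interval ** 2 / 12
    s1 = sum(f * x for f, x in zip(freqs, d)) + sum(b) * k12
    s2 = (sum(f * x * x for f, x in zip(freqs, d))
          + (n + 2 * sum(bk * x for bk, x in zip(b, d))) * k12)
    return s2 / n - (s1 / n) ** 2

marks, freqs, interval = [22, 26, 30, 34], [2, 4, 3, 1], 4
print(sloped_variance(marks, freqs, interval, False) ** 0.5)  # without the zero classes
print(sloped_variance(marks, freqs, interval, True) ** 0.5)   # with the zero classes
```

With the empty classes included, the slopes sum to zero, so the mean is left uncorrected, and the second-moment correction becomes $-n(I^2/12)$, which is Sheppard’s adjustment.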
TABLE II

COMPUTATION OF ARITHMETIC MEAN ($A$) AND STANDARD DEVIATION ($\sigma$),
ASSUMING FREQUENCIES SLOPED TOWARD THE MODE

Slope ($b$) taken as in Table I. Usual short-cut method, with corrections for sloped frequencies as
indicated.

     m       f     d     fd   + b(I²/12) = fd_c          fd²  + (f + 2bd)(I²/12) = fd²_c
     22      2    −4     −8   + 4/8 × 4/3                 32   (2 − 4) × 4/3
A_z = 26     4     0      0   + 1/8 × 4/3                  0   (4 − 0) × 4/3
     30      3     4     12   − 3/8 × 4/3                 48   (3 − 3) × 4/3
     34      1     8      8   − 3/8 × 4/3                 64   (1 − 6) × 4/3
          n = 10         12   − 1/8 × 4/3 = 11.833       144   − 3 × 4/3 = 140

$c = \Sigma fd_c/n = 1.1833$; $\Sigma fd^2_c/n = 14$
$\sigma^2 = \Sigma fd^2_c/n - c^2 = 14 - 1.4002 = 12.5998$
$\sigma = 12.5998^{1/2} = 3.55$
$A = A_z + \Sigma fd_c/n = 26 + 1.1833 = 27.1833$

Note. The above process may be abbreviated by using unit class intervals, thus eliminating the
factor $I$ from the formulas. That is, write the $d$ column: −1, 0, 1, 2. The final standard deviation
then is $I$ times $\sigma$ as computed; and $A = A_z + Ic$. If the slopes are regular according to the formula,
$\Sigma b/12 = (\text{last } f - \text{first } f)/24$, and the $2bd$ column may be written as $d(f_2 - f_1)$. In any case the maximum
slope is $2f/I$, or $2f$ when unit class intervals are used.

Many variations of the principle of sloped frequencies are possible, 
some of which would be desirable in special cases, but which cannot be 
described here. There is, however, a second approximation to the 
method as just described which is worth notice. This method requires 
the subdivision of each frequency into two sub-frequencies of equal 
class intervals. The sloped frequencies are normally adjusted to meet 
at a common point on the ordinates of the original class limits, this 
point being an average of the two points previously found. This 
point ($lh$) may, however, be conveniently found by a formula involving
the four adjacent frequencies. The height ($fh$) of the frequency
polygon at the new class limits (the original class marks) is then calculated
so as to retain the original area, with $n$ doubled and $I$ halved.
As applied to the preceding data, the method is illustrated in Table III.
The resulting frequency polygon is shown by the light dotted line in 
Chart I, and also appears in Chart II. The method is particularly 
useful in obtaining a close approximation of the quartiles. The types 
and measures, calculated from this polygon by the method of Tables 
I and II, are as follows: 

























A  = 27.183      Q_1 = 24.570
σ  =  3.487      Q_2 = 27.019
AD =  2.868      Q_3 = 29.675.
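
The construction of the smoothed polygon of Table III may be sketched in Python as follows. The function name and return layout are illustrative, not the author's; zero frequencies are assumed at the extremes, and the special cases mentioned in the note to Table III (for example an extreme $fh$ falling below an adjacent $lh$) are not handled.

```python
def smoothed_polygon(freqs, interval):
    """Second approximation: heights at class limits and marks, new frequencies, slopes."""
    k = len(freqs)
    f = [0, 0] + list(freqs) + [0, 0]          # two assumed zero classes on each side
    # Height at each original class limit, from the four frequencies centred there.
    lh = [(5 * (f[i + 1] + f[i + 2]) - (f[i] + f[i + 3])) / 8 for i in range(k + 1)]
    # Height at each original class mark, chosen so the class area is unchanged.
    fh = [(4 * freqs[i] - (lh[i] + lh[i + 1])) / 2 for i in range(k)]
    heights = []
    for i in range(k):
        heights += [lh[i], fh[i]]
    heights.append(lh[k])
    # New frequencies are averages of adjacent heights; slopes use the halved interval.
    new_f = [(a + c) / 2 for a, c in zip(heights, heights[1:])]
    b = [(c - a) / (interval / 2) for a, c in zip(heights, heights[1:])]
    return lh, fh, new_f, b

lh, fh, new_f, b = smoothed_polygon([2, 4, 3, 1], 4)
print(lh)
print(fh)
print(new_f)
print(b)
```

The new frequencies sum to twice the original $n$, which reflects the doubling of $n$ and the halving of $I$ described in the text.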


TABLE III

DATA OF TABLE I REDUCED TO A SMOOTHED FREQUENCY POLYGON OF TWICE
THE ORIGINAL NUMBER OF CLASSES, WITHOUT CHANGE OF AREA
WITHIN EACH ORIGINAL CLASS

The height ($lh$) of the frequency polygon at the original class limits is $lh = [5(f_2 + f_3) - (f_1 + f_4)] \div 8$,
where $f_1 \ldots f_4$ are the consecutive frequencies centering at the given class limit. The height ($fh$) at the
original class marks is $fh = [4f - (h_1 + h_2)] \div 2$, where $f$ is the class frequency, and $h_1$ and $h_2$ are the
heights of the polygon at the given class limits, respectively. The new frequencies ($f_c$) are the averages
of two adjacent heights; and $b$ is the slope per unit of the reduced class interval ($I$) from $h_1$ to $h_2$; i.e.,
$b = (h_2 - h_1) \div I$.

   x      f      lh        fh        f_c         b
   20            .75
   21                                1.34375     .59375
   22     2                1.9375
   23                                2.65625     .71875
   24            3.375
   25                                3.84375     .46875
   26     4                4.3125
   27                                4.15625    −.15625
   28            4.0
   29                                3.5        −.5
   30     3                3.0
   31                                2.5        −.5
   32            2.0
   33                                1.4375     −.5625
   34     1                .875
   35                                .5625      −.3125
   36            .25

Note. In using the above formulas, zero frequencies are understood at the extremes of the table.
Sometimes $fh$ in one of the extreme frequencies may prove to be less than an adjacent $lh$, in which case
the slope as calculated in Table I may be taken as the basis of sub-division, and two $lh$ points will appear
on the same ordinate. Other irregularities may also require special adaptations of the method.

Turning next to the problem of determining the mode, we may note
the method of interpolation by reference to the modal and two adjacent
frequencies. Two approximations are commonly given, the first of
which subdivides the modal class interval in inverse proportion to the
adjacent frequencies. The other subdivides the modal class interval
in inverse proportion to the differences of the modal frequency less the
two adjacent frequencies. The latter method is expressed by Professor
Rietz as follows:

$$Mo = L_1 - I(\Delta_1 \div \Delta_2)$$

the first and second differences thus indicated referring to the first of
the three frequencies centering at the mode, and the $L_1$ indicating the
lower limit of the modal class.













Of these two methods there is no doubt that the latter is preferable. 
Its advantage may be visualized if we consider the shifting of the mode 
when one of the frequencies adjacent to the modal frequency is thought 
of as increasing until it becomes the modal frequency. By the former 
method the mode under such conditions may change abruptly; by the 
latter, it shifts smoothly from one class to the other, falling exactly at 
the common class limit when the variable frequency becomes equal to 
the modal frequency. But its superiority may be more definitely 
established by proving that the mode thus determined is the mode of 
the quadratic parabola fitted to the three frequencies in question.¹
Since the logarithms of a normal frequency curve constitute a quad- 
ratic parabola, concave downward, the method is in fact a close approxi- 
mation to a limited curve fitting, not allowing, of course, for skewness. 
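
The agreement between the difference formula and the vertex of the fitted parabola can be illustrated with a brief Python sketch. Both function names are hypothetical, and the three frequencies are assumed not to be all equal, since the denominators would then vanish.

```python
def mode_by_differences(lower_limit, interval, f_prev, f_mode, f_next):
    """Mode from the positive differences between the modal and adjacent frequencies."""
    d1 = f_mode - f_prev
    d2 = f_mode - f_next
    return lower_limit + interval * d1 / (d1 + d2)

def mode_by_parabola(lower_limit, interval, f_prev, f_mode, f_next):
    """Vertex of the parabola through the three frequencies, via first and second differences."""
    delta1 = f_mode - f_prev                   # first difference at the first frequency
    delta2 = f_next - 2 * f_mode + f_prev      # second difference at the first frequency
    return lower_limit - interval * delta1 / delta2

# Modal class of the article's example: limits 24 to 28, with frequencies 2, 4, 3.
print(mode_by_differences(24, 4, 2, 4, 3))
print(mode_by_parabola(24, 4, 2, 4, 3))
# When the following frequency rises to equal the modal one, the mode
# falls at the common class limit.
print(mode_by_differences(24, 4, 2, 4, 4))
```

The two functions are algebraically the same, because the second difference equals the negative of the sum of the two positive differences.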

A method may be suggested for determining the mode (Mo) on a 
somewhat wider base with a rather precise allowance for the degree of 
skewness. Since the method rests on the theory of curve fitting to be 
discussed later, the rule only will here be given. The rule is as follows: 


1. Determine the quartiles ($Q_1$, $Q_2$, and $Q_3$), preferably by the method
of the sloped frequencies already described.

2. Find $c = (Q_2^2 - Q_1Q_3)/(Q_1 + Q_3 - 2Q_2)$, and add $c$ to each of the
quartiles. They should now constitute a geometric progression.

3. Find $\log (Q_2 + c)$. Also find $\log \sigma_r = [\log (Q_3 + c) - \log (Q_1 + c)] \times .7413$.

4. Solve for $Mo$ the equation,

$$\log (Mo + c) = \log (Q_2 + c) - (\log \sigma_r)^2/\log e$$
or
$$Mo + c = (Q_2 + c) \div \mathrm{antilog}\,[(\log \sigma_r)^2/.4343].$$


This procedure determines the mode of a logarithmic normal curve 
fitted to the data by a method which adjusts it to the degree of skew- 
ness indicated by the quartiles. The implied curve fitting may be 
considered, however, as merely a smoothing of the central part of the 
curve from Q, to Q3, rather than an exact determination of curve type. 
It will give good results with ordinary normal or skewed distributions. 
If the skewness is negative, the quartiles when corrected by adding $c$
will be negative, but the indicated logs may be taken as of positive
numbers, though in theory $Q_1$ and $Q_3$ are then interchanged. If $c$ is
infinite, the distribution is normal, and the mode lies at the second
quartile.

¹ Referring to the three frequencies centered about the mode and taking the X-scale in units of class
intervals originating at the first class mark we have, as the mode of a quadratic parabola passed through
the three frequencies as ordinates at their respective class marks:

$$X = 1/2 - \Delta_1 \div \Delta_2 \quad \text{(by Newton's formula)}$$

adapted to the X-scale of the given distribution this becomes:

$$Mo = L_1 - I(\Delta_1 \div \Delta_2).$$

In practice the formula may be conveniently written as:

$$Mo = L_1 + I\,d_1/(d_1 + d_2)$$

in which $d_1$ and $d_2$ are the positive differences between the modal frequency and the preceding and
succeeding frequencies, respectively.

Applied to the distribution previously used, as smoothed by the 
second approximation, the rule may be illustrated as follows¹ (the
calculation was carried out to more places than indicated):

        Quartiles     Q + c     log (Q + c)
   1.    24.570      28.848       1.4601
   2.    27.019      31.297       1.4955
   3.    29.675      33.953       1.5309

$c = (Q_2^2 - Q_1Q_3)/(Q_1 + Q_3 - 2Q_2) = (730.0158 - 729.1265)/.2079 = 4.278$

$\log \sigma_r = [\log (Q_3 + c) - \log (Q_1 + c)] \times .7413 = .05246$

$\log (Mo + c) = \log (Q_2 + c) - (\log \sigma_r)^2/\log e = 1.4955 - .00275/.4343 = 1.48917$

$Mo + c = 30.844$, and $Mo = 26.566$ (see Chart II)
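
The four-step rule can be sketched in Python as below. The function name is illustrative; returning the median when $Q_1 + Q_3 - 2Q_2$ is exactly zero, and restoring the sign of $Q_2 + c$ after taking logarithms of absolute values, are this sketch's reading of the remarks on the normal and negatively skewed cases rather than formulas stated in the article.

```python
import math

def lognormal_mode(q1, q2, q3):
    """Return (c, mode) for a shifted logarithmic normal curve through three quartiles."""
    denom = q1 + q3 - 2 * q2
    if denom == 0:
        # Symmetric case: c is infinite and the mode lies at the second quartile.
        return math.inf, q2
    c = (q2 * q2 - q1 * q3) / denom
    # Logs are taken of positive numbers, as the text suggests for negative skewness.
    log_sigma = (math.log10(abs(q3 + c)) - math.log10(abs(q1 + c))) * 0.7413
    log_mode = math.log10(abs(q2 + c)) - log_sigma ** 2 / math.log10(math.e)
    shifted_mode = math.copysign(10 ** log_mode, q2 + c)
    return c, shifted_mode - c

c, mode = lognormal_mode(24.570, 27.019, 29.675)
print(c, mode)
# After adding c, the quartiles should form a geometric progression.
print((24.570 + c) * (29.675 + c), (27.019 + c) ** 2)
```

Because $c$ is a ratio of two small differences, it is sensitive to rounding: with the quartiles as printed to three decimals the sketch gives a somewhat larger $c$ than 4.278, presumably because the author carried more places, while the resulting mode stays close to 26.566.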


CHART II

LOGARITHMIC NORMAL CURVE, FITTED TO THE DATA OF TABLES I AND II, AS
SMOOTHED BY THE METHOD OF SLOPED FREQUENCIES, SECOND
APPROXIMATION (SEE TABLE III)

Quartiles interpolated by the method applied to the median in Table I. The area of the curve is
computed to equal the area of the data in each of the sections cut off by the quartiles.

(Horizontal axis: class limits, 20 to 36.)


The mode thus determined is in some respects comparable to Pearson’s
determination of the mode by the use of Chi as indicating the
relation between the arithmetic mean ($A$) and the mode ($Mo$) thus:

$$Mo = A - \sigma\,\beta_1^{1/2}(\beta_2 + 3)/(10\beta_2 - 12\beta_1 - 18) = 26.291$$

which is not well adapted, however, to logarithmic distributions, particularly
when the skewness is considerable. In the distributions
commonly met with in economic statistics, the method described gives
better results, as far as the writer’s experience goes. The use of the
Betas in the analysis of frequency curves will be discussed later.

¹ The formula for $c$ is obtained by placing the quartiles plus $c$ in geometric progression and solving for
$c$. The formula for $\log \sigma_r$ is equivalent to dividing the quartile range on the log scale by 2 × .67449,
thus reducing it to a standard deviation on the same scale, which becomes a ratio ($r$) on the natural
scale. The formula for the mode is adapted from a study by the author in this JOURNAL, December,
1925, from which study the other formulas here used relating to the logarithmic curve of distribution
are derived.


II


We may now turn to methods which use the data merely as the basis 
of determining an appropriate generalized curve, and that may derive 
the various required measures and criteria from this curve. Outside of 
the admirable Pearson and Charlier systems of curves, we have only the 
ordinary Gaussian normal and its modification, the logarithmic normal. 
Since the latter can be varied to give any required degree of skewness, 
it is applicable to all distributions involving something like binomial 
chance, and therefore has a wide range of usefulness. The present 
discussion will be confined to this type of curve.¹

In an article in this JOURNAL for December, 1925, the writer discussed
certain methods of fitting a logarithmic normal curve, and the formulas 
here used are based in large part on that discussion, though with a 
revised technique. Such curves were fitted, as the mode just de- 
scribed, by the use of the geometric mean (or median) and the logarith- 
mic standard deviation, these measures being calculated directly from 
the data. This method of curve fitting will not, however, entirely 
serve the purpose at hand, since the calculation of the two basic meas- 
ures is disturbed by irregularities in the extremes of the distribution 
and besides does not conveniently permit an adjustment for the degree 
of skewness. 

Considerable experimentation was carried on with curve fitting 
based on a calculation of the Betas. But it soon became very clear that 
such a method is ill-adapted to the rather irregular dispersions met with 
in ordinary statistical practice, however desirable it may be for more 
regular distributions. If, as just noted, the second moment (cf. the 
geometric mean and the logarithmic standard deviation) is unduly dis- 
turbed by irregularities in the frequencies at the extremes, how much 
more sensitive to such irregularities will be the third and fourth mo- 
ments! Professor Mills has amply shown in his recent study of The 
Behavior of Prices that the Betas appear to classify a distribution of 
price relatives anywhere from Pearson’s Type 3 through Types 6, 5, 
and 4. Yet we do not conclude that such a distribution, where large 


1 A priori, the logarithmic normal might be anticipated for ratios which may be logically also written 
as reciprocals, since the curve of the ratios is then in harmony with the curve of the reciprocals. 


























360 American Statistical Association [12 


irregularities arise from common factors affecting groups of commodi- 
ties, really varies so widely in its real nature; particularly when both a 
priori, and actually in approaching regularity, it has the characteristics 
of the logarithmic normal type. Hence the attempt to analyze 
distributions by the use of the Betas was abandoned. 

At the opposite extreme from the analysis by use of the Betas is an 
analysis based on the median and the quartile dispersion. Such an 
analysis is extremely insensitive to variations at the upper and lower 
limits of the distribution. It is, of course, sensitive to variations at the 
quartile points, but at these points the frequencies are normally large 
and irregularities are consequently minimized. And, since the meas- 
urement rests on all three quartile points, it should be fairly representa- 
tive as applied to the familiar skewed distributions approximating the 
logarithmic form. Hence the quartile basis of analysis was adopted. 

Analysis by such curve fitting falls into two sub-classes. The first 
includes those dispersions which are not regarded as strictly of the 
logarithmic type in that they lack a logical point of origin, and yet 
which may presumably be smoothed by a logarithmic curve of the 
requisite degree of skewness. The second sub-class includes those 
cases which have a definite point of origin usually at the zero point of 
the magnitude scale, and which by inference and by experience we may 
assume to be of the type in question. Curves fitted to dispersions 
falling in the first group, if they give a close fit, may sometimes be 
taken as the basis of measures such as the arithmetic mean, the stand- 
ard deviation, and the Betas—a procedure which may quite generally 
be followed in connection with dispersions falling in the second group. 
Both groups will be illustrated in the order named. 

As an illustration of the first group the data already cited may be 
employed on the assumption that the frequencies belong to magnitudes 
having no sharply defined point of origin; as, for example, income 
classes, where some negative incomes may occur. The process of 
curve fitting has already been in part covered by the calculation in 
connection with the mode. In that calculation the quartiles were 
determined, a correction shifting them to a geometric series was added, 
and the logarithmic standard deviation ($\log \sigma_r$) was obtained. The
further analysis proceeds on the assumption that the entire distribution
is shifted with the quartiles to the position on the magnitude scale
indicated by the correction c; that is, the class marks now are considered
to be $m + c$. The median ($Q_2 + c$) of the dispersion may now be taken as
the geometric mean (G) with which in the logarithmic normal curve it
is theoretically identical. The geometric mean and logarithmic standard
deviation thus found would not be identical with the analogous
measures which might be computed directly from the logarithms of the
original class marks and the given frequencies, but they would be identical
with the results of such a computation applied to the shifted class
marks ($\log (m + c)$) with frequencies smoothed to the logarithmic normal
form. It is obvious that the curve now to be fitted to the shifted data
will have the same quartiles as these data, and therefore the same
skewness as measured by the logarithmic standard deviation or by the
quartile criterion:


$$Sk = (Q_1 + Q_3 - 2Q_2) / (Q_3 - Q_1).$$


It may afterwards be shifted back by subtraction of c from the abscissae 
to the original position on the X-scale, where it becomes a smoothed 
form of the data adjusted to the given skewness. The criterion of fit 
is that the area of the data and of the fitted curve cut off by successive 
quartiles should be respectively equal. It is because it thus constitutes 
a smoothing process that the curve may implicitly be used in the deter- 
mination of the mode as previously illustrated. 

If the curve is to be computed merely for plotting, the most direct 
method of computation is by the use of the equation of the curve. 
This equation written in a form convenient for computation is:¹

$$Y = (.17326\,nI / \log \sigma_r) \div \left( X \,\mathrm{antilog}\left[ \left(.21715 / (\log \sigma_r)^2\right) (\log X - \log G)^2 \right] \right)$$

in which the symbols are as previously employed. This formula is
merely the ordinary formula for the normal curve modified for the
logarithmic scale, and adjusted to the class interval (I) and the frequencies
(n) of the data. It may also be written in the form:

$$Y = (.4343\,nI / \log \sigma_r)\, z / X$$

where z is the ordinate obtained from a table of the normal curve of
unit area at the point

$$x/\sigma = (\log X - \log G) / \log \sigma_r.$$

If the points on the X-scale chosen for such a computation are the
class limits, then the normal frequencies as distinguished from the
ordinates may also be computed by taking the first differences of the
normal area at the given $x/\sigma$ positions, as is illustrated in Table IV.
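
As a rough check on the computation just described, the following Python sketch recomputes the ordinates and class frequencies from the constants quoted in the text (G = 31.2968, log σ_r = .05246, c = 4.278). The total frequency n = 10 and the class interval I = 4 are assumptions, inferred from the spacing of the class limits and from the product nI = 40 implied by the figures of Table IV; the function names are illustrative only.

```python
import math

# Constants quoted in the text for the worked example.
G = 31.2968          # geometric mean of the shifted distribution
LOG_SIGMA = 0.05246  # logarithmic standard deviation (common logarithms)
N_TOTAL = 10         # assumed total frequency n (inferred, not stated)
INTERVAL = 4         # assumed class interval I (spacing of the class limits)
C_SHIFT = 4.278      # correction c added to the class limits


def unit_deviation(x):
    """x/sigma: deviation of X from G on the logarithmic scale."""
    return (math.log10(x) - math.log10(G)) / LOG_SIGMA


def ordinate(x):
    """Y = (.4343 nI / log sigma_r) z / X, with z the unit-normal ordinate."""
    z = math.exp(-0.5 * unit_deviation(x) ** 2) / math.sqrt(2 * math.pi)
    return (0.4343 * N_TOTAL * INTERVAL / LOG_SIGMA) * z / x


def area(x):
    """Normal area measured from the mean, running from -.5 to .5."""
    return 0.5 * math.erf(unit_deviation(x) / math.sqrt(2))


upper_limits = [16, 20, 24, 28, 32, 36, 40, 44]
xs = [limit + C_SHIFT for limit in upper_limits]
ordinates = [ordinate(x) for x in xs]
areas = [area(x) for x in xs]

# First differences of the area, scaled by n, give the class frequencies.
# The lowest class absorbs everything below its upper limit.
frequencies = [
    N_TOTAL * (current - previous)
    for current, previous in zip(areas, [-0.5] + areas[:-1])
]

for limit, y, f in zip(upper_limits, ordinates, frequencies):
    print(f"L2={limit:2d}  Y={y:7.4f}  F={f:6.3f}")
```

The normal area is taken here from `math.erf` rather than from a printed table, so the last decimal may differ slightly from tabled values.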

To facilitate charting it is desirable to find the mode (Mo) and the
height of the curve at the mode ($Y_{Mo}$). The mode was previously
found to be 30.844, which becomes 26.566 when the curve is shifted
back to its original position. The height at the mode may be found by
the equations of the curve, taking X = 30.844, or more conveniently by
means of the height at the geometric mean ($Y_G$) as follows:

$$Y_G = (.17326\,nI / \log \sigma_r) / G.$$


1 The equation in its original form is given at the foot of Table V. 










Since z then becomes $1/\sqrt{2\pi}$, this expression may be changed to
indicate the height at the mode by means of a factor which recurs in
formulas relating to the logarithmic curve, namely,

$$\log a = (\log \sigma_r)^2 / \log e, \quad \text{or} \quad a = \mathrm{antilog}\,[(\log \sigma_r)^2 / .4343].$$

The height at the mode may now be expressed as

$$Y_{Mo} = Y_G\, a^{1/2} = 4.221 \times 1.0073 = 4.252$$

and the formula for the mode may be rewritten as

$$Mo = G/a.$$


TABLE IV


COMPUTATION OF NORMAL ORDINATES (Y) AND NORMAL FREQUENCIES (F) AT
UPPER CLASS LIMITS (L2) ADVANCED c = 4.2780 UNITS ON MAGNITUDE SCALE
(X = L2 + c). DATA OF PREVIOUS COMPUTATIONS


Find $x/\sigma = (\log X - \log G)/\log \sigma_r$; read ordinate (z) and area from table of normal curve of unit area;
take $Y = (.4343\,nI/\log \sigma_r)\,z/X$; and F as first differences of area from -.5 to .5; assuming G = 31.2968
and $\log \sigma_r$ = .05246. The two extreme F's contain small residuals belonging to more extreme frequencies.
The last column shows the deviations of the data from the normal in units of the probable error
of sampling. The calculations were carried to more decimal places than are here indicated.


 L2      X       log X     x/σ        z        Y        Area     Diff.     F      d/PE

 16    20.278   1.3070   -3.5926    .0006    .0101    -.4998    .0002    .002    .07
 20    24.278   1.3852   -2.1024    .0438    .5970    -.4822    .0176    .176    .63
 24    28.278   1.4514    -.8397    .2804   3.2837    -.2995    .1828   1.828    .1
 28    32.278   1.5089     .2556    .3861   3.9611     .1009    .4003   4.003    .00
 32    36.278   1.5596    1.2226    .1889   1.7245     .3893    .2884   2.884    .12
 36    40.278   1.6051    2.0886    .0450    .3703     .4816    .0924    .924    .12
 40    44.278   1.6462    2.8725    .0064    .0482     .4980    .0163    .163    .60
 44    48.278   1.6837    3.5884    .0006    .0043     .4998    .0019    .020    .21


To find the mode (Mo) and the ordinate at the mode ($Y_{Mo}$):
Let $a = \mathrm{antilog}\,[(\log \sigma_r)^2/.4343]$
Then $Mo = G/a = 31.297/1.0147 = 30.844$
and $Mo - c = 26.566$
$Y_{Mo} = (.17326\,nI/\log \sigma_r)\, a^{1/2}/G$
$= 132.108 \times 1.00732 \div 31.2968 = 4.2520$
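
The mode and its ordinate can be recomputed in a few lines. The sketch below uses only the constants quoted in the text; the product nI = 40 is an assumption inferred from the printed figure 132.108, and the variable names are illustrative.

```python
# Constants from the worked example in the text.
G = 31.2968          # geometric mean of the shifted distribution
LOG_SIGMA = 0.05246  # logarithmic standard deviation (common logarithms)
N_TIMES_I = 40.0     # assumed product n * I, inferred from 132.108
C_SHIFT = 4.278      # correction c

# The recurring factor a = antilog[(log sigma_r)^2 / .4343].
a = 10 ** (LOG_SIGMA ** 2 / 0.4343)

mode_shifted = G / a                       # mode on the shifted scale
mode_original = mode_shifted - C_SHIFT     # shifted back by subtracting c

height_at_g = (0.17326 * N_TIMES_I / LOG_SIGMA) / G   # Y_G
height_at_mode = height_at_g * a ** 0.5               # Y_Mo = Y_G * a^(1/2)

print(round(a, 4), round(mode_shifted, 3), round(mode_original, 3))
print(round(height_at_g, 3), round(height_at_mode, 3))
```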


The normal frequencies (F) as found in Table IV, though not so
convenient for charting, are theoretically more useful than the normal
ordinates in determining closeness of fit, in that they are more directly
comparable with the data. If we measure closeness of fit by the probable
error of sampling based on the normal frequencies, we find that
the actual frequencies fall well within the probable error of variation
as might be expected with so small a total sample. Such measures of
closeness of fit, however, do not have a great deal of significance in
economic data, for the simple reason that the "observations" are so
often not independent in the sense required in theory. For example,
prices influence each other and are influenced by common factors, and
weekly measurements of a cycle may amount to little more than a
recopying of the monthly measurements.

Suppose now we assume that the smoothed curve which we have
fitted represents a valid generalization of the data, and we wish to
obtain the arithmetic mean (A), the standard deviation ($\sigma$), and the
Betas ($\beta_1$ and $\beta_2$) of this smoothed distribution. The writer has
shown elsewhere that in logarithmic distributions,

$$A = G a^{1/2}$$

which, solved for the data previously used, advanced by c = 4.278 units
and shifted back by subtraction, after solving, to the original position
on the magnitude scale gives,

$$A = (31.297 \times 1.00732) - 4.278 = 27.248$$

as compared with 27.2 by the usual method. Also, it is easily deduced
that $\sigma = G\,[a(a-1)]^{1/2} = 31.297 \times .12212 = 3.822$ as compared with 3.6
by the usual method.
TABLE V
CRITERIA OF THE LOGARITHMIC NORMAL CURVE, FOR DESIGNATED
MAGNITUDES OF THE LOGARITHMIC STANDARD DEVIATION ($\log \sigma_r$)
If the curve is taken in terms of log X, instead of X, it becomes normal, and the criteria are as indicated
at $\log \sigma_r = 0$.


 log σ_r       β1          β2           κ1           r

  .000        .000       3.000        .000      Infinity
  .025        .030       3.053        .017       725.772
  .05         .121       3.216        .069       182.552
  .10         .508       3.917        .309        46.729
  .15        1.239       5.280        .845        21.600
  .20        2.474       7.699       1.976        12.830
  .25        4.523      11.989       4.410         8.798
  .30        7.976      19.905       9.883         6.635
  .35       14.014      35.468      22.893         5.361
  .40       25.108      68.611      55.898         4.562
  .45       46.735     146.083     145.962         4.043
  .50       91.828     346.871     412.258         3.697
  .55      193.074     928.103    1270.984         3.465
  .60      439.207    2815.778    4307.933         3.309


The equation of the logarithmic normal curve of unit area is,

$$Y = G \div \left[ X\,(2\pi)^{1/2}\, e^{\left(\log (X/G)\right)^2 / 2(\log \sigma_r)^2} \right]$$

when the area is measured in terms of the frequencies and the X-unit, $G (\log \sigma_r)/(\log e)$,
making the area, $.4343\,nI/(G \log \sigma_r)$.








On the basis of the usual calculation of the Betas and the value of the
moments ($\nu$) of the logarithmic normal curve from an origin of zero, as¹

$$\nu_m = G^m a^{m^2/2}$$

it may be shown that the Betas have the values


1 Professor H. L. Rietz in unpublished work has given a proof of this generalized equation of the
moments.





































$$\beta_1 = a^3 + 3a^2 - 4 = .134$$
$$\beta_2 = a^4 + 2a^3 + 3a^2 - 3 = 3.238$$


as compared with $\beta_1 = .039$ and $\beta_2 = 2.258$ by the usual method, or
$\beta_1 = .054$ and $\beta_2 = 2.094$ if Sheppard's corrections are used. It is
evident that with the higher moments the smooth curve gives increasingly
divergent results.
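
The closed forms for the Betas can be checked numerically against the moment formula. The sketch below assumes the moments about zero are $\nu_m = G^m a^{m^2/2}$, converts them to moments about the mean in the usual way, and compares the resulting Betas with the closed expressions; G = 31.297, a = 1.0147 and c = 4.278 are the figures of the worked example, and the helper name `raw_moment` is illustrative.

```python
G = 31.297       # geometric mean of the shifted distribution
a = 1.0147       # factor a = antilog[(log sigma_r)^2 / .4343]
C_SHIFT = 4.278  # correction c


def raw_moment(m):
    """m-th moment about zero of the logarithmic normal curve."""
    return G ** m * a ** (m * m / 2)


v1, v2, v3, v4 = (raw_moment(m) for m in (1, 2, 3, 4))

# Moments about the mean from moments about zero.
mu2 = v2 - v1 ** 2
mu3 = v3 - 3 * v1 * v2 + 2 * v1 ** 3
mu4 = v4 - 4 * v1 * v3 + 6 * v1 ** 2 * v2 - 3 * v1 ** 4

mean_original = v1 - C_SHIFT   # arithmetic mean shifted back by c
std_dev = mu2 ** 0.5           # unaffected by the shift
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2

# Closed forms stated in the text.
beta1_closed = a ** 3 + 3 * a ** 2 - 4
beta2_closed = a ** 4 + 2 * a ** 3 + 3 * a ** 2 - 3

print(round(mean_original, 3), round(std_dev, 3))
print(round(beta1, 3), round(beta2, 3))
```

Since the Betas are ratios of moments about the mean, shifting the curve back by c leaves them unchanged.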

Other criteria, as $\kappa_1 = a^2 (2a^2 + a - 3)$ and $r = 6a^3 (a+1)/\kappa_1$, might
be calculated; but it will perhaps be sufficient to indicate the general
relation of the logarithmic normal curve to such criteria by means of
Table V, which indicates their values for various values of $\log \sigma_r$, of
which they are functions. Chart III shows the place of the same
curve in the Pearson system.


CHART III


THE RELATION OF THE LOGARITHMIC NORMAL CURVE TO THE CURVES IN THE
PEARSON SYSTEM
The logarithmic normal falls between Types III and IV, and in the area of Type VI.

[Chart: scale of Beta One plotted against scale of Beta Two.]
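
The entries of Table V follow directly from the formulas for the Betas and for the two further criteria. A minimal Python sketch, assuming $a = \mathrm{antilog}\,[(\log \sigma_r)^2/.4343]$ as defined earlier (the function name `criteria` is illustrative):

```python
def criteria(log_sigma):
    """Return (beta1, beta2, kappa1, r) for a given logarithmic standard deviation."""
    a = 10 ** (log_sigma ** 2 / 0.4343)
    beta1 = a ** 3 + 3 * a ** 2 - 4
    beta2 = a ** 4 + 2 * a ** 3 + 3 * a ** 2 - 3
    kappa1 = a ** 2 * (2 * a ** 2 + a - 3)
    # At log sigma_r = 0 the curve is normal, kappa1 vanishes and r is infinite.
    r = 6 * a ** 3 * (a + 1) / kappa1 if kappa1 > 0 else float("inf")
    return beta1, beta2, kappa1, r


for log_sigma in (0.0, 0.05, 0.10, 0.20, 0.40):
    b1, b2, k1, r = criteria(log_sigma)
    print(f"{log_sigma:5.3f}  {b1:8.3f}  {b2:8.3f}  {k1:8.3f}  {r:10.3f}")
```

Small differences in the last decimal are to be expected, since the printed table was presumably computed with logarithm tables.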























TABLE VI
VARIABILITY OF LENGTH OF BUSINESS CYCLE IN THE UNITED STATES, 1796-1923


(Mitchell, Business Cycles, p. 399) Analysis by logarithmic normal curve based on quartiles interpolated
by sloped frequencies.


    (1)             (2)           (3)           (4)            (5)
  Duration     No. of Cycles    Normal     Deviations (d)      Fit
  Years (m)         f              F         (2) - (3)         d/PE

     1              1             .68           .32             .58
     2              4            5.21         -1.21            -.86
     3             10            8.15          1.85            1.11
     4              5            6.96         -1.96           -1.25
     5              6            4.65          1.35            1.00
     6              4            2.77          1.23            1.15
     7              1            1.58          -.58            -.70
     8              0             .88          -.88           -1.41
     9              1             .49           .51            1.09
    10              0             .27          -.27            -.77

  Q1 = 2.805      A  = 4.155 (4.031)*       σ  = 1.975 (1.686)*
  Q2 = 3.672      G  = 3.752 (3.676)*       β1 = 2.352 (.588)*
  Q3 = 5.157      Mo = 3.061 (3.0455)*      β2 = 7.453 (3.578)*


* Usual method; substituting the fitted curve materially increases the higher moments. Sloping the
frequencies does not change A, and decreases σ only from 1.686 to 1.669. The correction for sloped
frequencies is unimportant here, since the distribution is irregular and the number of classes adequate.
The Betas by the usual method, without Sheppard's correction, are .538 and 3.544 respectively.


The foregoing calculations have been given rather to indicate meth- 
ods than to determine valid results for the data in question. We may 
therefore turn in conclusion to data which may more plausibly be 
regarded as belonging to the logarithmic type, namely, the distribution 
of business cycles in the United States classified by length in years, as 
given in Mitchell, Business Cycles, page 399. The data and the 
smoothed curve are presented in graphic form in Chart IV, and the 
various measures and criteria here discussed are given in Table VI. 
The calculations are as previously described, except that, inasmuch as 
the assumption of a logarithmic type is made, no correction is computed 
for the quartiles; that is, the curve is not shifted to a different position 
on the magnitude scale. In its given position the curve as determined 
by the quartiles is not exactly logarithmic, the logarithms of the 
quartiles giving a coefficient of skewness of .116.¹ In fitting the curve
this error was distributed by basing the geometric mean on the quartiles,
weighting the latter by their ordinates in a normal curve, that is:


$$\log G = (\log Q_1 + \log Q_3 + 1.2554 \log Q_2) / 3.2554.$$


Otherwise, the computations were the same as those already described. 
The fit of the curve is satisfactory as indicated by the probable error 
of sampling. 
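
The fitted measures of Table VI can be approximately recovered from the three quartiles alone. The sketch below is an illustration under one stated assumption: the logarithmic standard deviation is taken as the logarithmic semi-interquartile range divided by .67449, the distance of the quartiles from the median in a normal curve. The text does not restate that step in this passage, so it is labeled as an assumption in the code as well.

```python
import math

# Quartiles of the cycle durations, from Table VI.
Q1, Q2, Q3 = 2.805, 3.672, 5.157

# Geometric mean: quartile logarithms weighted by their normal-curve ordinates.
log_g = (math.log10(Q1) + math.log10(Q3) + 1.2554 * math.log10(Q2)) / 3.2554
G = 10 ** log_g

# Assumption: the quartiles lie .67449 standard units either side of the median.
log_sigma = (math.log10(Q3) - math.log10(Q1)) / (2 * 0.67449)

# Recurring factor and the derived measures of the fitted curve.
a = 10 ** (log_sigma ** 2 / 0.4343)
mean = G * a ** 0.5
mode = G / a
std_dev = G * (a * (a - 1)) ** 0.5
beta1 = a ** 3 + 3 * a ** 2 - 4
beta2 = a ** 4 + 2 * a ** 3 + 3 * a ** 2 - 3

print(round(G, 3), round(mean, 3), round(mode, 3), round(std_dev, 3))
print(round(beta1, 3), round(beta2, 3))
```

No shift c is applied here, in keeping with the statement that the curve is not moved on the magnitude scale for data assumed to be of the logarithmic type.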


1 Experience with such curves indicates that this coefficient may range as high as .15 or even .20. 



















































The differences between the Betas of the data and the Betas of the 
smoothed curve are apparently due chiefly to differences at the upper 
magnitudes where a slight shifting of the data makes marked differences 
in the higher moments. The discrepancy bears out the statement 
previously made that the moments are not very dependable criteria to 
use as a basis of curve fitting in economic data, particularly when their 
probable error is high or when, as so often occurs, their probable error 
is not a valid measure of their dependability because of linkages in the
observations.

CHART IV 
FREQUENCY DISTRIBUTION OF DURATION OF CYCLES IN THE UNITED STATES 
(SEE TABLE VI), TOGETHER WITH A FITTED LOGARITHMIC NORMAL 
CURVE, ASSUMING THE DATA TO BE OF THIS TYPE 


The correction for sloped frequencies was applied in the curve fitting, but is relatively unimportant 
in this case. 


























[Chart: data and fitted log normal curve; horizontal axis, duration of cycles in years.]










19] Limitations in Applicability of Contingency Coefficient 367 


A SECOND CATEGORY OF LIMITATIONS IN THE 
APPLICABILITY OF THE CONTINGENCY COEFFICIENT 


By J. ARTHUR HARRIS AND CHI TU, University of Minnesota


Pearson in his fundamental paper on the theory of contingency¹
indicated clearly the difficulties of comparing coefficients of association,
correlation and contingency. He then wrote (p. 22): "The degree of
approach of both $C_1$ and $C_2$ to the correlation must be studied for each
special class of cases, and only when this has been done will their use be
really legitimate and effective." This was written twenty-five years
ago, but notwithstanding various subsequent contributions by Pearson
himself, much remains to be done in the study of the special classes of
cases to which it may seem desirable to apply contingency methods.

This is unfortunate because of the fact that the contingency coeffi- 
cient is the most general measure of the relationship between two 
variables. The contingency coefficient is independent of the nature 
of the variables (whether quantitatively measurable or only describ- 
able in categories), of the form of the frequency distribution, of the 
order of arrangement of the classes, and of the nature of the regression 
curve. 

In an earlier paper the inapplicability of the contingency method
in its usual form to cases in which x is the limiting value of y was
indicated.² The purpose of the present paper is to call attention to
another category of cases in which the application of the contingency
method may lead to coefficients differing widely from the correlation
coefficient and the correlation ratio.

The present category has in common with that already discussed
the condition that certain cells of the contingency surface (showing
the distribution of frequencies of individuals with the attributes x and
y, each of several classes) are void of actual frequencies because of
physical reasons but are necessarily assigned theoretical frequencies
in the conventional method of calculating the independent probabilities
of occurrence of combinations of x and y.

In the case already considered the void cells are distributed in the 
form of a triangle occupying the region of the surface above or below 
the diagonal row of cells, and the limitation in the applicability of the 
contingency method results from the fact that y is necessarily equal to 


1 K. Pearson, "On the Theory of Contingency and Its Relation to Association and Normal Correlation,"
Drapers' Company Research Memoirs, Biometric Series 1: 1-35, Pl. 1-2, 1904.

2 J. Arthur Harris and Alan E. Treloar, "On a Limitation in the Applicability of the Contingency
Coefficient," this Journal 22: 460-472, 1927.










or less than x. The present category differs from the former in the 
distribution and nature of origin of the void cells. In the present case 
the limitations to the occurrence of actual frequencies in certain cells 
distributed over the surface depend upon morphological limitations, 
or upon the necessity for recording the values of variates in discon- 
tinuous classes. 

RESULTS 


The category is illustrated by the following specific case. 

In certain morphological and evolutionary investigations involving 
the problem of the differential death-rate of the ovaries of Staphylea 
it has been necessary to determine the relationship between the locular 
composition¹ of the ovary and its radial asymmetry, or the root-mean-square
deviation of the number of ovules from the mean number in
the individual ovary. The reasons for dealing with these measures
have been fully set forth elsewhere² and will not receive further attention
here, since the only present consideration is the discrepancy between
three measures of interrelationship as applied to a specific kind
of frequency surface. 

A glance at Table I, which represents one of these series (A, B, C) 
of records fully described in the original memoir, shows that certain 
combinations of radial asymmetry and locular composition do not 
occur. The reason is a purely physical or numerical one. Ovaries 

TABLE I
CONTINGENCY SURFACE FOR RELATIONSHIP BETWEEN RADIAL ASYMMETRY
AND LOCULAR COMPOSITION IN STAPHYLEA (SERIES A)


                                        COEFFICIENT OF ASYMMETRY

Locular Composition | .0000 | .4714 | .8165 | .9428 | 1.2472 | 1.4142 | 1.6330 | 1.6997 | 1.8856 | Totals

3 even 0 odd        |   462 |       |       |   130 |        |        |      2 |        |      1 |    595
2 even 1 odd        |       |   614 |   138 |       |     21 |     14 |        |      1 |        |    788
1 even 2 odd        |       |   443 |    95 |       |     22 |      8 |        |      5 |        |    573
0 even 3 odd        |   103 |       |       |    35 |        |        |      1 |        |        |    139

Totals              |   565 |  1057 |   233 |   165 |     43 |     22 |      3 |      6 |      1 |   2095











which are made up of two locules with an even number of ovules and 
one locule with an odd number of ovules, or of one locule with an even 
and of two locules with an odd number of ovules, cannot have a coefficient
of asymmetry of 0.0000, since one of the locules must differ from
the other two by at least one ovule. Similarly, trimerous ovaries 
which are made up either of three locules with even numbers of ovules 


1 "Locular composition" represents merely the number of cells or locules of the ovary with odd (5,
7, 9, etc.) numbers of ovules.

2 J. Arthur Harris, "On the Selective Elimination Occurring During the Development of the Fruits of
Staphylea," Biometrika 7: 452-504, 1910.

J. Arthur Harris, "The Selective Elimination of Organs," Science, n.s. 32: 519-528, 1910.







or of three locules with odd numbers of ovules may have coefficients of 
radial asymmetry of 0.0000, 0.9428, 1.6330 or 1.8856, but cannot have 
coefficients of asymmetry of 0.4714, 0.8165, 1.2472, 1.4142, 1.6997, 
2.0548 or 2.1602, since to maintain the specified locular composition 
one of the locules must differ from the other two in number of ovules 
by at least two ovules. 

Now these physical limitations, imposed by the fact that numbers 
of ovules must be recorded in discontinuous classes, must influence 
profoundly the magnitude of the contingency coefficient. 

Remembering that for the contingency surface of p classes of x, and
q classes of y

$$\chi^2 = S_{1}^{p}\, S_{1}^{q} \left\{ \frac{(n_{uv} - n'_{uv})^2}{n'_{uv}} \right\}$$

where $n_{uv}$ is the actual frequency of individuals having the characteristics
of the uth class of x and the vth class of y, and $n'_{uv}$ is the frequency
calculated on the assumption of complete independence of x
and y, it is clear that the contribution of cells which are void for any
reason will be

$$S_{1}^{p}\, S_{1}^{q}\, (n'_{uv})$$

and that this may be a relatively large part of the value of $\chi^2$ on which
the magnitude of

$$C_1 = \left( \frac{\chi^2/N}{1 + \chi^2/N} \right)^{1/2}$$

depends.

The consequences appear at once in Table II, which gives the
correlation coefficients, correlation ratios, and contingency coefficients
measuring the relationship between locular composition, c, and radial
asymmetry, a, in the three series of ovaries. Here $r_{ac}$ is the product
moment correlation coefficient, $\eta_{ac}$ is the correlation ratio¹ determined
from the means of the arrays of radial asymmetries associated with
various classes of locular composition, $\eta_{ca}$ is the correlation ratio
determined from the means of locular composition associated with
various classes of asymmetry, and $C_1$ is the first coefficient of contingency.


TABLE II


COMPARISON OF CORRELATION COEFFICIENTS, CORRELATION RATIOS AND
CONTINGENCY COEFFICIENTS MEASURING THE RELATIONSHIP BETWEEN
RADIAL ASYMMETRY AND LOCULAR COMPOSITION


                            Correlation | Correlation | Correlation | Contingency
Series               N      Coefficient |    Ratio    |    Ratio    | Coefficient
                               r_ac     |    η_ac     |    η_ca     |     C1

Eliminated (A)      2095      .2455     |    .4906    |    .4541    |    .7082
Developing (B)      2465      .3295     |    .5447    |    .4720    |    .7133
Matured (C)         2704      .3280     |    .5112    |    .6603    |    .7073


1 K. Pearson, "On the General Theory of Skew Correlation and Non-Linear Regression," Drapers'
Company Research Memoirs, Biometric Series 2. London, 1905.

The contingency coefficients are conspicuously larger than either of 
the other constants. They are about fifty per cent higher than the 
correlation ratios, and are nearly three times as large as the correlation 
coefficients. 

There can be no doubt that the high value of the contingency coefficient
as compared with the other measures of interrelationship is due
to the contribution to the value of $\chi^2$ of the cells which for physical
reasons cannot contain frequencies. The percentage contributions of
these classes are:

                      Total       Contribution     Percentage
                       χ²         of void cells    contribution of
                                                   void cells
Eliminated (A)       2108.36         953.74           45.24
Developing (B)       2552.99        1141.97           44.73
Matured (C)          2706.79        1352.01           49.95


Thus about half of the total value of $\chi^2$ is contributed by the void cells.
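
The figures for the eliminated series can be checked from the frequencies of Table I. The following Python sketch computes $\chi^2$ on the usual independence hypothesis, the part of it arising from empty cells, and $C_1$. One point is an assumption rather than something the text states: every cell with no observed frequency is counted as void here, which reproduces the printed contribution of 953.74.

```python
# Observed frequencies of Table I (Series A). Rows: locular composition;
# columns: asymmetry classes .0000, .4714, .8165, .9428, 1.2472, 1.4142,
# 1.6330, 1.6997, 1.8856.
OBSERVED = [
    [462, 0, 0, 130, 0, 0, 2, 0, 1],    # 3 even 0 odd
    [0, 614, 138, 0, 21, 14, 0, 1, 0],  # 2 even 1 odd
    [0, 443, 95, 0, 22, 8, 0, 5, 0],    # 1 even 2 odd
    [103, 0, 0, 35, 0, 0, 1, 0, 0],     # 0 even 3 odd
]

row_totals = [sum(row) for row in OBSERVED]
col_totals = [sum(col) for col in zip(*OBSERVED)]
n_total = sum(row_totals)

chi_square = 0.0
void_part = 0.0
for r, row in enumerate(OBSERVED):
    for c, observed in enumerate(row):
        expected = row_totals[r] * col_totals[c] / n_total
        term = (observed - expected) ** 2 / expected
        chi_square += term
        if observed == 0:
            # Assumption: an empty cell is treated as void; its term equals
            # its theoretical frequency.
            void_part += term

c1 = (chi_square / (n_total + chi_square)) ** 0.5
void_percentage = 100 * void_part / chi_square

print(round(chi_square, 2), round(void_part, 2), round(void_percentage, 2), round(c1, 4))
```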

It is clear that as a measure of the interdependence of x and y, the
value of $C_1$ is largely determined by factors which are rigidly fixed
by the conditions inseparable from the numerical-morphological 
characteristics of the organ on which the coefficient is based. We 
shall show in a moment that to some extent this is also true of the 
correlation coefficient and of the correlation ratios. 

In attempting to free contingency surfaces of the kind here under 
consideration from the limitation due to the existence of necessarily 
void cells, two wholly empirical procedures may be followed. 

First, we may condense the theoretical frequencies of void cells into 
the cells in which frequencies can actually occur. 

An indefinitely large number of methods of distributing the the- 
oretical frequencies of the void cells among the occupied cells are pos- 
sible. In the absence of any certainty as to what is the theoretically 
best method, some one procedure which has at least the merit of being 
consistently applicable must be tested. The concentration of the the- 
oretical frequencies from the void cells into the occupied cells in propor- 
tion to the numbers theoretically already there present affords such a 
method. Thus the corrected theoretical frequency for any occupied
cell will become

$$n''_{uv} = n'_{uv}\, \frac{N}{N - S_v(n'_{uv})}$$










where $n''_{uv}$ denotes the theoretical number in the occupied cells after
correction and $S_v$ denotes summation for the theoretically void cells
only of the whole surface. Whatever may be the objections to this
method, it meets the necessary requirement $S(n''_{uv}) = N$.
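
A minimal Python sketch of this proportional redistribution, applied to the frequencies of Table I, is given below. It assumes the correction takes the form $n''_{uv} = n'_{uv} N / (N - S_v(n'_{uv}))$, and it assumes that the theoretically void cells are those pairing a row of uniform parity with an asymmetry class open only to mixed rows, or the reverse, as Table I suggests. The helper `is_void` and the index sets are illustrative, not the authors' notation.

```python
# Observed frequencies of Table I (Series A).
OBSERVED = [
    [462, 0, 0, 130, 0, 0, 2, 0, 1],    # 3 even 0 odd
    [0, 614, 138, 0, 21, 14, 0, 1, 0],  # 2 even 1 odd
    [0, 443, 95, 0, 22, 8, 0, 5, 0],    # 1 even 2 odd
    [103, 0, 0, 35, 0, 0, 1, 0, 0],     # 0 even 3 odd
]

# Assumption: columns 0, 3, 6, 8 (.0000, .9428, 1.6330, 1.8856) are open only
# to rows of uniform parity (rows 0 and 3); the other columns only to mixed rows.
UNIFORM_ROWS = {0, 3}
UNIFORM_COLS = {0, 3, 6, 8}


def is_void(r, c):
    """A cell is void when row parity type and column type do not match."""
    return (r in UNIFORM_ROWS) != (c in UNIFORM_COLS)


row_totals = [sum(row) for row in OBSERVED]
col_totals = [sum(col) for col in zip(*OBSERVED)]
n_total = sum(row_totals)
n_rows, n_cols = len(OBSERVED), len(col_totals)

# Theoretical frequencies n' on the hypothesis of independence.
expected = [
    [row_totals[r] * col_totals[c] / n_total for c in range(n_cols)]
    for r in range(n_rows)
]

void_sum = sum(
    expected[r][c] for r in range(n_rows) for c in range(n_cols) if is_void(r, c)
)
scale = n_total / (n_total - void_sum)

# Corrected theoretical frequencies n'': zero in void cells, scaled elsewhere.
corrected = [
    [0.0 if is_void(r, c) else expected[r][c] * scale for c in range(n_cols)]
    for r in range(n_rows)
]
corrected_total = sum(sum(row) for row in corrected)

print(round(void_sum, 3), round(scale, 5), round(corrected_total, 6))
```

The final line confirms only the requirement that the corrected frequencies sum to N; it does not by itself show that the method is the best of the many possible redistributions.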

Second, we may attempt to remove the numerical-morphological 
limitation to its applicability by splitting our materials up into classes 
such that the specified limitation will not obtain. By separating 
the classes of locular composition into two groups 


Group A: 3 odd, 3 even 
Group B: 2 odd+1 even, 1 odd+2 even 


we may divide each contingency surface into two bi-serial tables each 
of which will have a smaller number of asymmetry classes but each of 
which will represent distributions which may have frequencies in 
every cell.¹ Thus the (4 x 9)-fold surface given in Table I may be
broken into the (2 x 4)-fold and the (2 x 5)-fold distributions shown 
as Tables III and IV. 
TABLE III
BI-SERIAL CONTINGENCY SURFACE FROM TABLE I


                              RADIAL ASYMMETRY

Locular Composition | .0000 | .9428 | 1.6330 | 1.8856 | Totals

3 even              |   462 |   130 |      2 |      1 |    595
3 odd               |   103 |    35 |      1 |        |    139

Totals              |   565 |   165 |      3 |      1 |    734


TABLE IV
BI-SERIAL CONTINGENCY SURFACE FROM TABLE I


                              RADIAL ASYMMETRY

Locular Composition | .4714 | .8165 | 1.2472 | 1.4142 | 1.6997 | Totals

2 even 1 odd        |   614 |   138 |     21 |     14 |      1 |    788
1 even 2 odd        |   443 |    95 |     22 |      8 |      5 |    573

Totals              |  1057 |   233 |     43 |     22 |      6 |   1361








The relationship between the two variables (locular composition 
and radial asymmetry) involved in such tables might be treated by 
Pearson's methods of determining the bi-serial r,² or the bi-serial $\eta$,³


1 For this suggestion we are indebted to Mr. Alan E. Treloar.

2 K. Pearson, "On a Novel Method of Determining Correlations Between a Measured Character A,
and a Character B, of Which Only the Percentage of Cases Wherein B Exceeds (or falls short of) a
Given Intensity is Recorded for Each Grade of A," Biometrika 7: 96-105, 1910.

3 K. Pearson, "On a New Method of Determining Correlation When One Variable is Given by
Alternative and the Other by Multiple Categories," Biometrika 7: 247-257, 1910.
















bearing in mind any limitations in the applicability of these methods 
inherent in our present data. 

For each bi-serial table (Tables III and IV) we calculate $\chi^2$. Since
the values of $C_1$ may not properly be determined by the usual theory
for a (2 x 4)-fold or (2 x 5)-fold table, we may base our first evaluation
of the relationship between a and c in such bi-serial tables on P,¹
as is wholly legitimate for such cases.

The results appear in Table V. Here the value of P is determined
from Elderton's table² with n' equal to the number of cells of the
bi-serial distribution. The procedure is not in accord with R. A.
Fisher's conclusion³ that for (r x s)-fold tables n' = (r-1)(s-1)+1.


TABLE V


Series            | Classes        |    N   |    χ²    |    P     |   C1

Eliminated (A)    | 3o, 3e         |   734  |  1.3791  | .981564  | .0433
                  | o+2e, 2o+e     |  1361  |  6.1140  | .728201  | .0669
                  | Total Surface  |  2095  |  7.4905  | .975209  | .0597

Developing (B)    | 3o, 3e         |   896  | 22.6940  | .001961  | .1572
                  | o+2e, 2o+e     |  1569  | 16.2103  | .238653  | .1011
                  | Total Surface  |  2465  | 38.9476  | .016335  | .1247

Matured (C)       | 3o, 3e         |  1358  |   .4111  | .984612  | .0549
                  | o+2e, 2o+e     |  1346  |   .9805  | .994930  | .0850
                  | Total Surface  |  2704  |  1.3902  | .999910  | .0715





The values of P for the bi-serial distributions suggest rather too close 
agreement between observation and theory, rather than the divergence 
of the observed from the theoretical system as indicated by the above 
values of $C_1$ for the surface as a whole.

But our real need is not fully met by the information provided by
these bi-serial tables as such. We require some measure of contingency
for the surface as a whole which will be free from the influence of the
contribution of the void cells. Let $\alpha$ denote the bi-serial table of $N_\alpha$
entities for cases in which all the cells of the ovary have the same
locular composition and let $\beta$ designate the bi-serial table of $N_\beta$
entities in which the cells are not of the same locular composition.

Now in these two bi-serial distributions, $\alpha$ and $\beta$, into which the

1 K. Pearson, "On the Criterion That a Given System of Deviations from the Probable in the Case of
a Correlated System of Variables is Such That It can be Reasonably Supposed to Have Arisen from
Random Sampling," Phil. Mag. 50: 157-175, 1900.

2 W. P. Elderton, "Tables for Testing the Goodness of Fit of Theory to Observation," Biometrika 1:
155-163, 1901. Reprinted as Table XII in Pearson's Tables for Statisticians and Biometricians.

3 R. A. Fisher, "On the Interpretation of χ² from Contingency Tables and the Calculation of P,"
Journal of the Royal Statistical Society 85: 87-94, 1922.








surface has been broken, the theoretical frequencies have been properly
distributed in cells in which observational frequencies may occur.
For each of the two bi-serial distributions $S_\alpha(p_{xy}) = 1$, $S_\beta(p_{xy}) = 1$.

If these bi-serial tables were merely again combined in tables of four
classes of locular composition, $S_\alpha(p_{xy}) + S_\beta(p_{xy})$ would be 2, and we
would furthermore give no weight to the difference between $N_\alpha$ and
$N_\beta$. We therefore take

$$p''_\alpha = \frac{N_\alpha}{N}\, p_{xy}, \qquad p''_\beta = \frac{N_\beta}{N}\, p_{xy},$$

which fulfill the necessary requirement $S(p'') = 1$.

This leaves the values of $n'_{xy}$ for the whole surface identical with
those calculated for individual bi-serial tables, and for the surface
as a whole $\chi^2 = \chi^2_\alpha + \chi^2_\beta$.
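
The relation between the χ² totals and the contingency coefficients tabulated in Table V can be checked with a few lines of arithmetic. The sketch below assumes Pearson's mean-square contingency formula, C = sqrt(χ² / (N + χ²)), which this section does not restate but which reproduces the printed total-surface coefficients for series A and B. The function name is illustrative only.

```python
import math


def contingency_coefficient(chi2: float, n: int) -> float:
    """Mean-square contingency coefficient, sqrt(chi2 / (n + chi2)).

    Assumption: this is Pearson's standard formula; the section itself
    only tabulates chi-square, N and the resulting coefficient.
    """
    if n <= 0 or chi2 < 0:
        raise ValueError("n must be positive and chi2 non-negative")
    return math.sqrt(chi2 / (n + chi2))


# Total-surface chi-square and N as printed in Table V.
table_v_totals = {
    "Eliminated (A)": (7.4905, 2095),
    "Developing (B)": (38.9476, 2465),
}

coefficients = {
    series: contingency_coefficient(chi2, n)
    for series, (chi2, n) in table_v_totals.items()
}

for series, c in coefficients.items():
    print(f"{series}: C = {c:.4f}")
```

Because χ² for the whole surface is the sum of the two bi-serial values, the same function applies unchanged to either sub-table or to their total.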

Comparing the results for these two methods of redistributing the
theoretical frequencies with the (spurious) values of $C_1$ given above, we
find the values shown in Table VI.


TABLE VI
COMPARISON OF CORRECTED AND UNCORRECTED CONTINGENCY COEFFICIENTS

                   Original       Method A:                  Method B:
                   contingency    theoretical frequencies    probabilities computed
Series             coefficient    computed from separate     for filled cells only,
                   C₁             bi-serial tables, i.e.,    by taking
                                  from p″_α, p″_β            p′_xy = p_xy / S(p_xy)

Eliminated (A)     .7082          .0597                      .0931
Developing (B)     .7133          .1254                      .2864
Matured (C)        .7073          .071                       .0231

















The elimination of the influence of the void cells by these two wholly 
empirical methods has reduced the contingency coefficients to very 
low magnitudes. 

Returning to the coefficient of correlation and the correlation ratios, 
we note a large discrepancy between the two sets of values. 

That these are due to the nature of the regression is suggested by 
the results for the contingency coefficient, and is made quite evident by 
a comparison of the empirical means with the regression straight lines, 
which need not be represented here. 

Blakeman's test for linearity of regression¹ gives the following results.


1 J. Blakeman, “On Tests for Linearity of Regression in Frequency Distributions,” Biometrika 4:
332-350, 1905.
























American Statistical Association 


Series             For η_ca     For η_ac
Eliminated (A)      17.31        14.46
Developing (B)      19.08        13.82
Matured (C)         16.08        27.53


These ratios can leave no question concerning the significantly non- 
linear nature of the relationship between locular composition and radial 
asymmetry. Non-linearity of regression is, indeed, morphologically
necessary.


DISCUSSION OF RESULTS 


Consideration of the biological interpretation of the results from the 
botanical data necessarily employed falls outside the scope of this 
paper. We note the following results of interest from the standpoint 
of statistical methodology. 

The three statistical procedures here applied to the measurement of 
the interrelationship between locular composition and radial asym- 
metry in the ovary of Staphylea lead to very different numerical values. 
The orders of magnitude of the constants are so diverse, and the 
results for any one method so consistent from series to series, that it is 
not necessary to compare the coefficients due to the three methods 
with regard to their probable errors in order to assert the significance 
of their differences. 

The differences between the correlation coefficient, $r$, and the
correlation ratios, $\eta_{ca}$ and $\eta_{ac}$, are clearly attributable
to the conspicuously non-linear nature of the regression of these two
characters. This point requires no further consideration.

The contingency coefficient, $C_1$, is roughly fifty per cent higher than
the correlation ratios, $\eta_{ca}$ and $\eta_{ac}$, and about two or three times as
large as the correlation coefficient, $r$. This high value of the
contingency coefficient is primarily attributable to the high percentage
contribution of void cells to the value of $\chi^2$. Two empirical methods
of redistributing the theoretical frequencies from the cells which are 
void because of physical reasons have been suggested and tested, with 
the result that the “corrected” contingency coefficients are reduced
to very low values. The present case represents, therefore, a second
class of limitation in the applicability of the contingency method as at
present developed and applied. From these results, as well as from
those of a preceding paper,¹ it is evident that in the use of the
contingency coefficient in biological or other investigations care should
be used to ascertain whether any of the cells of the surface are
necessarily void.


1 Harris and Treloar, loc. cit. 










These results should not, however, be used as a basis for the un- 
restricted criticism of the contingency method. In many cases it is 
the only method available for dealing with the full ranges of the catego- 
ries of x and y. Furthermore much depends upon the purpose for 
which it is to be applied. If the thing which is required in the investi- 
gation in hand be a measure of the deviation of the given system from 
independent probability, irrespective of the nature of the origin of this 
deviation, the contingency coefficient possibly furnishes the best 
measure of the three ($r$, $\eta$ and $C_1$). It must be remembered that our
primary problem is the comparability of the results of the various 
methods of measuring interrelationship. 

While these limitations in the contingency coefficient (as a constant 
comparable with the correlation coefficient or correlation ratio) have 
been tested on the basis of data expressed on a quantitative scale, there 
can be no objection to this procedure, which is the necessary one if 
comparisons are to be made between contingency coefficients and other 
measures of interrelationship. The demonstration of inconsistencies 
for quantitatively measured variables may preclude errors of interpreta- 
tion in some cases in which contingency methods are applied to surfaces 
representing the relationship between two variables classed in cate- 
gories. 



































HORSEPOWER STATISTICS FOR MANUFACTURES¹


By Willard L. Thorp, Amherst College


The census records of horsepower are our one great source of data 
concerning industrial equipment. They are put to all manner of uses, 
from estimating national savings to examining the productivity of 
capital. The figures for 1927, now available, indicate that the out- 
standing gain since 1925 among the various items covered by the 
biennial census was in the horsepower of prime movers. It is therefore 
extremely important that statisticians should examine these data 
critically.² This paper will present a recomputation of the census
data, and then will discuss the significance and usefulness of horsepower 
statistics as at present defined. 


AN INDEX OF HORSEPOWER 


The earliest inquiry into motive power appeared in the census of
1850 under the heading, “Kind of motive power, machinery, structure
or resource.” That a quantitative investigation was not intended is
evident from the instructions to the assistant marshals who acted as
enumerators, suggesting that the entry made on the schedules read
“water, steam, horse, wind or otherwise as the fact may be.” This
same inquiry appeared in the census schedules of 1860. In 1870, actual
statistics of horsepower were collected for the first time. A separate
section of the manufactures schedule dealt with motive power. The
instructions to enumerators included the direction to obtain “if steam
or water, the number of horsepower.” This was further elaborated in
1880, and the instructions to enumerators, referring to the horsepower
of steam or water prime movers, said, “This is an inquiry of great
importance. The best information available should be used in filling


1 The author is indebted to Mr. LeVerne Beales, Chief Statistician for Manufactures, Bureau of the 
Census, for reading the manuscript and making numerous suggestions leading to greater accuracy and 
precision. 

2 The uncritical use of these data is not limited to amateur statisticians. In The Growth of Manu- 
factures, 1889 to 1923, by Edmund E. Day and Woodlief Thomas (Census Monograph VIII, Washing- 
ton, 1928), the authors present rates of increase for various periods without suggesting the qualifications 
developed in this paper. Likewise, the now repentant author must plead guilty of the same misplaced 
faith in the data when presented in The Integration of Industrial Operation (Census Monograph III, 
Washington, 1924). Both of these studies remark the changed scope of the census, but all possible 
errors are incorporated in a recent study by Dr. Carroll R. Daugherty, The Development of Horsepower 
Equipment in the United States (U. S. Geological Survey Water Supply Paper 579, and reprinted as a
Thesis of the University of Pennsylvania). The absence of critical discussion in the publications of the 
Census Bureau prior to the 1927 Census of Manufactures has misled many authors who have lifted the 
data directly from that source, for example, Marshall and Lyon, Our Economic Organization, p. 219; 
Bogart and Landon, Modern Industry, p. 393. 



















29] Horsepower Statistics for Manufactures 377 


these columns.” The census of 1890 expanded the inquiry to include
power supplied by outside sources, and also power other than steam
and water, suggesting that the entries read “electric, gas, or other.”
Since that time, there has been little change in the character of the
inquiry into power, although the subdivisions of the published
tabulation have been revised somewhat. The census of 1921 completely
omitted the inquiry concerning prime movers “in the interest of
economy in both time and expense.” Since then, two minor subdivisions
have been dropped from the compilation, namely, the rated capacity of
water motors, and rented power other than electric.

Not only have changes taken place in the character of the power 
inquiry, but the scope of the inquiry has also undergone variations. 
This is particularly true of the earlier years. The census of 1880, for 
example, completely omitted nine industries from the power tabulation. 
Some of these were of considerable importance, such as glass, gas, and 
distilled and malt liquors. The census of 1890 included an industry
called “Electric light and power” among the manufacturing industries.
It was never included before, nor has it appeared since. All such 
variations, where known, have been eliminated in one way or another 
in Table I. However, certain general changes in scope took place at 
the censuses for 1899 and 1919, for which it is impossible to make cor- 
rections. Consequently, the data must be given in three sections, with 
the year of change as an overlap. The general character of each such 
period is as follows: 


1869-1899. Manufacturing including hand and neighborhood 
industries except cotton ginning, for which data were not 
collected prior to 1889. Some slight inconsistency is present, 
due to the restriction of the first two censuses to steam and water 
power created at the plant. The inclusion for 1889 of data for 
other types of prime movers and for purchased power added 
102,000 horsepower, or 1.8 per cent of the total. 

1899-1919. Manufacturing under the factory system, including 
all establishments whose value of products was $500 per annum 
or more. 

1919-1927. Manufacturing under the factory system, including
all establishments whose value of products was $5,000 per annum
or more. However, the 1919 power figures have not been
corrected for the reduced scope of the census. They therefore
include data for horsepower in establishments where the product
is $500 or more but less than $5,000. In 1919, such establishments
employed but 0.5 per cent of the wage earners and produced
but 0.4 per cent of the total value of products. Their
proportion of total horsepower was probably even less, and is
stated by the Bureau of the Census to have “only a negligible
effect on the comparability of the statistics.” During this
period, water motors and rented power other than electricity are
not included. The automobile repairing and poultry killing and
dressing industries are not included.


TABLE I
PRIMARY HORSEPOWER USED IN MANUFACTURING INDUSTRIES, 1869-1927

                 Primary horsepower
                 Amount          Increase                                     Index of
Census year      (Thousands of   (Thousands of   (Per cent     (Per cent     horsepower
                  horsepower)     horsepower)    for period)   per annum)    1899=100

1869 ........     2,317 a                                                       21.1
1879 ........     3,547 b         1,230           53.1          4.4             32.3
1889 ........     5,858 c         2,312           65.2          5.1             53.3
1899 ........    10,985           5,127           87.5          6.5            100.0

1899 ........     9,941 d                                                      100.0
1904 ........    13,488           3,547           35.7          6.3            135.7
1909 ........    18,675           5,188           38.5          6.7            187.9
1914 ........    22,437           3,762           20.1          3.7            225.7
1919 ........    29,505           7,068           31.5          5.6            296.8

1919 ........    29,309 e                                                      296.8
1923 f ......    33,092           3,784           12.9          3.1            335.1
1925 ........    35,804 g         2,712            8.2          4.0            362.6
1927 ........    38,826           3,022            8.4          4.1            393.2


a Excluding quartz milling, lead pig, and sugar made on plantations, not included hereafter.

b In the original tabulation, the following industries were omitted: bottling, car repairing, coke,
dyeing and finishing textiles, gas, glass, liquors distilled, and malt, shipbuilding. Data for the glass
industry were subsequently published. For the other industries, an estimate was made on the
assumption that their increase from 1869-1889 was divided between the two decades in the same
proportion as the increase for all other industries for the same period.

c Excluding electric light and power.

d Using the official correction for custom sawmills and gristmills, and the figure of 157,000 horsepower
for other hand and neighborhood industries, suggested in the census of 1905.

e Including establishments whose annual product is $500 or more but less than $5,000. See text for
importance of this group. An arbitrary sum of 15,000 horsepower was deducted to correct for water
motors, included here for the last time. In 1909, water motors totaled 15,449 horsepower. The total
for all forms of water power declined by 3.2 per cent from 1909 to 1919.

f No data concerning power collected in the census of 1921.

g Coffee and spice roasting and grinding added, totaling 37 thousand horsepower.


Perhaps the most significant columns in Table I are the last two. 
The index of horsepower indicates the extraordinary increase in the 
utilization of power machinery during this period. But while the in- 
crease in recent years continues to be enormous, it is important to note 
that the rate of increase is less. The War appears to have speeded up 
the process somewhat, but the years immediately afterward recorded a
compensating retarding of power extension. The turning point ap- 
pears to be about 1909. For the two preceding decades, the annual 
rate of increase had been approximately 6.5 per cent. Since then, it 
has averaged slightly under 4.2 per cent per annum. 
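
The two average rates quoted above can be recovered from the index column of Table I. The sketch below assumes that "per cent per annum" means a compound (geometric) rate, which agrees with the printed column (for instance, 53.1 per cent over ten years corresponds to 4.4 per cent per annum). The function name is illustrative.

```python
def annual_rate(start: float, end: float, years: int) -> float:
    """Compound annual rate of increase, in per cent."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return ((end / start) ** (1.0 / years) - 1.0) * 100.0


# Index of horsepower (1899 = 100) as printed in Table I.
index = {1889: 53.3, 1909: 187.9, 1927: 393.2}

before = annual_rate(index[1889], index[1909], 1909 - 1889)
after = annual_rate(index[1909], index[1927], 1927 - 1909)

print(f"1889-1909: {before:.1f} per cent per annum")
print(f"1909-1927: {after:.1f} per cent per annum")
```

Working from the index rather than the raw amounts avoids the breaks in scope at 1899 and 1919, since the index has already been spliced across the overlapping years.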










THE SIGNIFICANCE OF HORSEPOWER DATA 


Although we are all aware of the changes which have transpired 
among power engines, especially with the increased application of 
electricity, we have usually disregarded them on the assumption that a 
horsepower was a horsepower, regardless of its parentage. The 
Bureau of the Census refers to horsepower as follows: 

Primary power, as the term is used by the Bureau of the Census, comprises 
all power which is primary from the standpoint of the manufacturing establish- 
ments using it. It includes, therefore, not only the power of engines and water 
wheels owned and operated by the manufacturing establishments, but also 
rented power—that is, the power of electric motors run by purchased current. 
. . . Primary power does not include the power of electric motors which are 
run by current generated in the same establishment, since the inclusion of such 
power would result in duplication. 


The use of rated capacity further makes for comparability, since the 
efficiency of the power-generating machine does not enter into the pic- 
ture. A 500 horsepower engine is one which can deliver 500 horse- 
power of energy, though one type of engine may accomplish this result 
with much less fuel consumption than some other. 

Although the above statements as to the comparability of the unit 
and the similarity of the point of measurement are both true, the very 
rapid growth of electric power, as distinct from steam, water and in- 
ternal-combustion engines, has definitely affected the nature of the 
horsepower tabulation in several ways. The rapid advance of pur- 
chased electricity among the forms of primary power is indicated by the 
tabulation in Table II of percentage distribution for various census 
years. 














TABLE II
PERCENTAGE DISTRIBUTION OF PRIMARY POWER BY TYPE OF PRIME MOVER¹

                                      Internal      Purchased power
Census year     Steam     Water      combustion    Electric    Other

1889 ........    77.3      21.1         0.2                      1.5
1899 ........    81.1      14.4         1.3           1.8        1.4
1909 ........    76.2       9.8         4.0           9.4        0.7
1919 ........    57.8       6.0         4.3          31.7        0.3
1923 ........    50.5       3.7         5.5          40.4
1925 ........    47.3       3.3         5.0          44.4
1927 ........    43.6       3.0         4.1          49.3


1 Census data used have not been adjusted as in Table I.


This tremendous increase in the use of electricity has introduced at 
least three new factors into the situation. Two have tended to reduce 
the total statistics of horsepower, and one has tended to increase the 
totals. 





















































The first new factor is a difference in the significance of “rated
capacity” when applied to steam engines and to electric motors.¹
Rated capacity for water wheels and steam engines represents in
general the maximum load which they can carry. The limiting conditions
are the safe maximum steam pressure and the maximum portion of the
piston stroke during which steam can be admitted by the valves.
Under a much heavier load, they will presumably stop. However, it is
possible to run an electric motor for short² periods of time at
considerably more than rated capacity. The situation is described by the
Westinghouse Electric and Manufacturing Company as follows:³


For electric motors, the usual limiting factor is the temperature at which the 
insulation of the motor is destroyed and a motor can safely carry much greater 
overloads in a cold windy place than in a hot stuffy place. Most electric motors 
will carry two or three times their rated horsepower without being stopped by 
the load, but the length of time they will carry it is dependent on how the insula- 
tion withstands the temperature attained. As long as no part of the motor 
exceeds about 90° to 100° C., no damage will be done. 


To apply the above information to a specific case, suppose a factory 
where the requirement is normally 100 horsepower, with an occasional 
brief peak load of perhaps 250 horsepower. An electric motor with 
rated capacity of 100 horsepower could meet the requirements, but 
a much more powerful steam engine would probably be needed for the 
same conditions. A steam turbine would probably fall between the 
reciprocating steam engine and the electric motor in its capacity for 
overload. In industries, therefore, where power requirements are 
erratic, the substitution of electric power for other forms has made 
possible an apparent reduction in the horsepower of prime movers, 
though the actual power requirements may remain the same. 
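
The factory example above can be put in numbers. The sketch below is only an illustration: the overload factors (2.5 for the electric motor, 1.0 for the reciprocating steam engine) are assumed values chosen to match the example, the first lying within the "two or three times" range quoted from Westinghouse, and the function name is hypothetical.

```python
def required_rating(normal_load: float, peak_load: float,
                    overload_factor: float) -> float:
    """Smallest rated capacity (horsepower) that carries the normal load
    continuously and the brief peak at overload_factor times the rating."""
    if overload_factor < 1.0:
        raise ValueError("overload_factor must be at least 1.0")
    return max(normal_load, peak_load / overload_factor)


# Normal requirement 100 horsepower, occasional brief peak of 250.
electric_motor = required_rating(100, 250, overload_factor=2.5)  # assumed factor
steam_engine = required_rating(100, 250, overload_factor=1.0)    # assumed factor

print(f"electric motor rating: {electric_motor:.0f} horsepower")
print(f"steam engine rating:   {steam_engine:.0f} horsepower")
```

Under these assumptions the same plant would be recorded with 100 horsepower if electrically driven and 250 horsepower if driven by steam, although its actual power requirements are identical.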

A second factor affecting the horsepower statistics has been the 
substitution of electrical power transmission within the plant for the 
older forms of mechanical transmission. The old belt-shaft-and-belt 
method has given way in many establishments to the use of electric 
generators transforming the mechanical energy to electrical energy 
which is then transmitted to motors placed throughout the shop. 
While improvements in the efficiency of prime movers themselves are 


1 For assistance at this point the author is indebted to Professor William W. Stifler and Mr. Theodore 
Soller of the Department of Physics, Amherst College. 

2 Dean Dexter S. Kimball of the College of Engineering, Cornell University, suggests that it is 
probable that for long periods of time the steam engine can carry a larger overload than the electric 
motor without injury. 

3 Letter to the author, September 5, 1929. A similar inquiry to the General Electric Company
brought the reply that most electric motors have a maximum 1-minute overload capacity varying from
50 per cent to 150 per cent or even more above their rated capacity. Furthermore, motors are sometimes
designed to have overload capacities of 25 per cent or 50 per cent or other values for a period of 2 hours,
or any other definite time specified. Rated capacity always refers to continuous operation.













not significant in our problem, any means for improving the efficiency 
of transmission of the power-product to the place where it is needed 
will have a direct effect in reducing the primary horsepower required to 
accomplish a given amount of work. The development of electrical 
transmission is depicted in Table III. 


TABLE III
COMPARISON OF PRIMARY POWER-GENERATING MACHINERY AND ELECTRIC
MOTORS RUN BY GENERATED POWER, 1899-1927
(Thousands of horsepower)

                 Primary             Electric motors
Census year      power-generating    driven by current
                 machinery           generated in plant

1899 ........     9,627                  310
1909 ........    16,803                3,068
1919 ........    20,026                6,969
1923 ........    19,728                8,821
1925 ........    19,906               10,262 a
1927 ........    19,693               11,207


a Coffee and spice roasting and grinding estimated at 7 thousand.


Since 1919 there has been a slight decrease in the capacity of
primary power-generating machinery within manufacturing
establishments. The electric motors driven by power generated from the
mechanical power produced by this same machinery have increased 61
per cent during the same eight years.
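
The two movements just described follow directly from the 1919 and 1927 rows of Table III. A minimal sketch of the arithmetic (the helper name is illustrative):

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# Thousands of horsepower, from Table III.
prime_movers = {1919: 20026, 1927: 19693}
plant_motors = {1919: 6969, 1927: 11207}

prime_mover_change = pct_change(prime_movers[1919], prime_movers[1927])
plant_motor_change = pct_change(plant_motors[1919], plant_motors[1927])

print(f"primary power-generating machinery: {prime_mover_change:+.1f} per cent")
print(f"motors on plant-generated current:  {plant_motor_change:+.1f} per cent")
```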

It must not be assumed that the difference between the two columns 
in Table III represents the power applied by other methods of
transmission. There is considerable loss of power in the process of
transformation, which would make necessary a somewhat greater primary
horsepower than the capacity of electric motors which could be run by 
this power. But, more important, is the fact that the primary horse- 
power is often below the capacity of the electric motors in an establish- 
ment because they do not all run at once, nor at full capacity. 

How much the horsepower of the country has been potentially re- 
duced by the introduction of electrical transmission cannot even be 
estimated. In some instances, an efficient belt and shaft system may 
be no more wasteful of power than electrical transmission. However, 
in establishments which require flexibility or which cover a large floor 
area, the use of electricity may result in considerable saving. 

There can be no doubt but that during the period for which we have 
power statistics there has been marked improvement in the methods of 
transmission. It is important to note, however, that even the best 
methods of electrical transmission result in considerable loss of power. 







































The following material from a letter from the Westinghouse Electric & 
Manufacturing Company indicates the extent of this loss: 


Taking the indicated horsepower of a steam engine as 100 per cent, the effi- 
ciency of the transformation steps is about as follows: 


                                                Efficiency      Overall
                                                of device       efficiency
Steam engine (mechanical efficiency)..........  83 to 90%       83 to 90%
Electric generator............................  85 to 95%       70.5 to 85.5%
Transmission of electrical energy.............  90 to 100%      63.4 to 85.5%
Electric motor................................  50* to 95%      31.7 to 81.2%


Taking an example: A 100 H.P. prime mover, at 100 H.P. as tested by a 
steam engine indicator, would probably be delivering about 90 “brake horse- 
power” to the electric generator, which in turn would deliver about 84 horse- 
power or 62.6 kilowatts of electrical power. Of this amount, assuming local use, 
probably not less than 98 per cent, or 82.3 H.P., would be available as input 
to an electric motor, which could deliver about 74 H.P. actual mechanical power 
output. For larger machines the efficiencies might be little higher and for small 
machines they would be much less. 
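
The "overall efficiency" column in the Westinghouse figures is the running product of the device efficiencies. The sketch below reproduces it to within the rounding of the printed figures; the stage names for the generator and the motor are taken from the worked example in the letter and are to that extent an assumption.

```python
from itertools import accumulate
from operator import mul

# (stage, lowest device efficiency, highest device efficiency)
STAGES = [
    ("Steam engine (mechanical)", 0.83, 0.90),
    ("Electric generator", 0.85, 0.95),
    ("Transmission of electrical energy", 0.90, 1.00),
    ("Electric motor", 0.50, 0.95),
]


def overall_efficiency(device_efficiencies):
    """Cumulative efficiency after each successive transformation step."""
    return list(accumulate(device_efficiencies, mul))


low = overall_efficiency([lo for _, lo, _ in STAGES])
high = overall_efficiency([hi for _, _, hi in STAGES])

for (name, _, _), lo, hi in zip(STAGES, low, high):
    print(f"{name:<36} {lo * 100:5.1f} to {hi * 100:5.1f} per cent")
```

On the most favorable figures, about four-fifths of the indicated horsepower reaches the motor shaft; on the least favorable, under one-third.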


The two factors in the problem which have been discussed, the 
possibility of overload and the development of electrical transmission, 
have probably tended slightly to reduce the horsepower record of 
industry. But the third, and by far the most important, has been 
much more than an offsetting factor—the increase in the purchase of 
outside power. The net changes in the balance sheet of prime movers 
between 1919 and 1927 are as follows: 





                                                   Gain       Loss
                                                (Thousands of horsepower)

Steam engines.................................               3,696
Steam turbines................................    3,585
Internal-combustion engines...................                  71
Water wheels and water turbines...............                 152
Electric motors driven by purchased current...    9,850

                                                 13,435      3,919


The horsepower of electric motors driven by purchased current has 
more than doubled since 1919, and is now approximately one-half of all 
primary horsepower. 

The census data refer to rated capacity. In the case of purchased 
power, the measure is the rated capacity of all motors which are driven 
by this purchased power. But it often happens that the capacity of 
such electric motors, in total, far exceeds their actual collective power 
requirements. They may not all be run at once nor may they all be 
run at full capacity. It is by no means unusual for a steam engine of 


* Only in small, fractional horsepower motors is the efficiency as low as 50 per cent. In motors of a
few horsepower and on up the efficiency will be above 80 per cent.










50 horsepower capacity to furnish the primary power in a plant to 
electric motors with 150 horsepower rated capacity. The primary 
power is then reported as 50 horsepower. If the steam engine is dis- 
mantled and power purchased from outside, the primary horsepower 
would then be recorded as 150 horsepower. 
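
The reporting rule behind this example can be stated compactly. The sketch below follows the census definition quoted earlier (owned prime movers plus motors run by purchased current, with motors on plant-generated current excluded to avoid duplication); the function and parameter names are hypothetical.

```python
def census_primary_hp(owned_prime_movers_hp: float,
                      motors_on_purchased_current_hp: float) -> float:
    """Primary horsepower as the census counts it: rated capacity of owned
    prime movers plus rated capacity of motors run by purchased current.
    Motors run by plant-generated current are not counted."""
    return owned_prime_movers_hp + motors_on_purchased_current_hp


# A plant with 150 horsepower of electric motors, first fed by its own
# 50-horsepower steam engine, then by purchased current alone.
before = census_primary_hp(owned_prime_movers_hp=50,
                           motors_on_purchased_current_hp=0)
after = census_primary_hp(owned_prime_movers_hp=0,
                          motors_on_purchased_current_hp=150)

print(f"reported before: {before} horsepower")
print(f"reported after:  {after} horsepower")
```

The recorded figure triples although the motors, and the work they do, are unchanged.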

The facts which bring about this condition were recognized in the 
Census Bureau in 1910, though the implications were not publicly 
suggested at that time.¹ In the census of 1910 the phenomenon is
described as follows: 

In the foundries and machine shops of Massachusetts, for example, there 
were in 1909 electric motors operated by current generated in the establishment 
with an indicated horsepower of 104,727, or nearly twice the total primary
horsepower of the industry in the state, which was 52,802. . . . It is evident, therefore,
that the horsepower of these motors has no necessary relation to the total amount 
of power used (the total primary power). 

An examination of the records of industries for 1927 indicates that 
there are thirty-five in which the rated capacity of motors operated by 
power generated in the establishments exceeds the rated capacity of the 
prime movers available for creating this power—a seeming defiance of 
the law of conservation of energy. To make this situation absolutely 
clear, the data for the ten industries in which the discrepancy is
greatest are given in Table IV.


TABLE IV
COMPARISON OF POTENTIAL GENERATING HORSEPOWER AND ELECTRIC
MOTORS DRIVEN BY GENERATED POWER
TEN INDUSTRIES, 1927

                                                  Electric motors driven by
                                                  power generated at the plant
Industry                                          (excluding purchased power):
                                                  per cent of primary power

Screws, wood....................................  216
Photographic apparatus and materials............  214
Carpets and rugs, wool..........................  199
Wrought pipe....................................  182
Engines, turbines and water wheels..............  175
Labels and tags.................................  168
Coffee and spice, roasting and grinding.........  164
Billiard and pool tables, bowling alleys, etc...  160
Aluminum manufactures...........................  149
Oilcloth........................................  147














1 That the implication is at present clearly understood in the Census Bureau is indicated by the fact 
that the consideration of the effect of the increase in rented power was first called to the attention of the 
author by Mr. LeVerne Beales, Chief Statistician for Manufactures. It is officially recognized in the 
report on prime movers of the census of 1927, published after this article was written, which explains the 
difficulty and states: “The maximum demand of the group of machines, which the steam engine must be
able to meet, would be materially less than the sum of the maximum demands of the individual machines, 
which the motors must be able to meet. Thus, the increase in the total rated capacity of prime movers, 
coming, as it does, wholly from new installations of electric motors, is not a true measure of the increase 
in the consumption of mechanical power in manufacturing establishments.” 
















The discrepancy is in reality even greater than is indicated by the 
table. The discussion at an earlier point in the paper indicated that 
perhaps twenty per cent of the horsepower of the original prime mover 
is lost in the process of transforming it to electrical power. Further- 
more, no allowance has been made for the fact that within these indus- 
tries, some plants doubtless use all or a part of their steam, water, or 
internal-combustion-engine power directly in its mechanical form. 
And furthermore, some of the current generated in the plants is used 
for light, heat, and electro-chemical purposes. It is therefore quite 
possible that, on the average for all industrial plants, a one-horsepower 
prime mover motivates electric motors rated at one and one-half 
horsepower. The ratio may be even higher than this. 


TABLE V
PRIMARY HORSEPOWER IN MANUFACTURING INDUSTRIES, OWNED AND
PURCHASED, 1899-1927 a

                 Amount          Increase
Census year      (Thousands of   (Thousands of   (Per cent     (Per cent
                  horsepower)     horsepower)    for period)   per annum)

Owned:
1899 b ......     9,627
1904 ........    12,855           3,228           33.5          6.0
1909 ........    16,803           3,948           30.7          5.5
1914 ........    18,410           1,607            9.6          1.8
1919 ........    20,063           1,653            9.0          1.7

1919 c ......    20,026
1923 ........    19,728            -298           -1.5         -0.4
1925 d ......    19,906             178            0.9          0.4
1927 ........    19,693            -213           -1.1         -0.5

Purchased: e
1899 b ......       314
1904 ........       633             319          101.5         15.0
1909 ........     1,873           1,240          195.9         24.2
1914 ........     4,027           2,154          115.0         16.5
1919 ........     9,442           5,415          134.5         18.6

1919 c ......     9,283
1923 ........    13,365           4,082           44.0          9.5
1925 d ......    15,898           2,533           19.0          9.1
1927 ........    19,132           3,234           20.3          9.7


a See Table I for footnotes.

b Adjustment for hand and neighborhood industries distributed in the same proportion as factories
operating under factory system.

c Poultry killing and dressing estimated at one thousand for each.

d Coffee and spice roasting and grinding estimated at 4 thousand owned and 33 thousand purchased.

e For the period 1899-1919 rented power other than electric is included, which amounted to 43 per
cent of the total in 1899 and 1 per cent in 1919.


The condition outlined above would not demand serious considera- 
tion if owned and purchased power had increased at the same rate. 
The records of growth of the two are shown in Table V. 
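
The per-annum column of Table V appears to be a compound rate derived from the increase for the period. The following is a minimal sketch on that assumption; the function name is illustrative, and the census intervals (five years between the earlier censuses, four years for 1919-1923, two years thereafter) are inferred from the table rather than stated in it.

```python
def annual_rate(period_pct: float, years: int) -> float:
    """Compound annual rate (per cent) equivalent to a total change over `years`."""
    growth = 1.0 + period_pct / 100.0
    return (growth ** (1.0 / years) - 1.0) * 100.0

# (label, per cent for period, assumed years between censuses), from Table V
rows = [
    ("owned, 1904-1909", 30.7, 5),
    ("purchased, 1904-1909", 195.9, 5),
    ("purchased, 1919-1923", 44.0, 4),
    ("purchased, 1925-1927", 20.3, 2),
]
for label, pct, years in rows:
    print(f"{label}: {annual_rate(pct, years):.1f} per cent per annum")
```

On this reading a gain of 44.0 per cent in four years and a gain of 20.3 per cent in two years are nearly the same annual rate, which is why the per-annum column is the proper one for comparing periods of unequal length.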


It should be evident from the above discussion that if the require-
ments for power in the country did not increase in any way, but there
were a gradual replacement of power generation at the plant, by the
purchasing of power, the records of horsepower would advance. 
While such replacement has been going on to some extent, the most 
serious misconceptions created by the horsepower data arise from the 
change in the nature of power additions, as indicated in Table V. 
Prior to 1909, the increases reported were chiefly of owned power; since 
1909, the advance has been dominated by purchased power. But the 
two are not comparable. When compared with the owned power, the 
purchased-power records may be termed inflated. Consequently, the
rates of gain in recent years cannot be compared directly with those of 
earlier years. They should be reduced by some arbitrary coefficient, 
possibly as much as one-third, before they can be used to draw conclu- 
sions concerning the growth of horsepower in this country. 
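
One way of applying such a coefficient is sketched below. It rests on an assumption that is not the author's stated procedure: the one-third reduction is applied to the purchased-power increases before they are set beside the increases in owned power. The figures are the increases, in thousands of horsepower, shown in Table V for the last three intercensal periods; the period labels, the coefficient, and the names are illustrative.

```python
REDUCTION = 1.0 / 3.0  # "possibly as much as one-third"; an arbitrary coefficient

# period: (owned increase, purchased increase), thousands of horsepower, Table V
increases = {
    "1919-1923": (-298, 4082),
    "1923-1925": (178, 2533),
    "1925-1927": (-213, 3234),
}

def adjusted_increase(owned: float, purchased: float, reduction: float = REDUCTION) -> float:
    """Owned increase plus the purchased increase deflated by `reduction`."""
    return owned + purchased * (1.0 - reduction)

for period, (owned, purchased) in increases.items():
    reported = owned + purchased
    adjusted = adjusted_increase(owned, purchased)
    print(f"{period}: reported {reported:,}, adjusted {adjusted:,.0f}")
```

The adjusted figures are only as good as the coefficient, which the text itself calls arbitrary.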

The methods which might be adopted to present a uniform record of 
the various types of prime movers are not simple. The engines and 
turbines in the establishment might be put on a basis comparable to 
the purchased-power data if their records were increased so as to 
present the rated capacity which would be required if all machinery 
were operated at once. Or the purchased power data could be made 
comparable with owned power if, instead of the rated capacity of the 
motors in the establishment, the rated capacity were given of the 
power plants from which the energy is drawn. But here the allow- 
ances necessary for various other forms of consumption of electricity 
generated at the same power plant would make the problem well-nigh 
insoluble. Perhaps the wisest solution, and certainly the easiest, is to 
resist the temptation to add the two types of horsepower together, and 
present them as separate data with avowedly incommensurable units, 
as is done in Table V. 

In summary, the evidence presented concerning the nature of horse- 
power data indicates that in recent years, despite certain offsetting 
factors, the increase has been magnified by the fact that it has come 
entirely in the form of rented power. Bearing this fact in mind, an 
examination of the records of increase in primary horsepower in the 
United States indicates that, although the growth is substantial, it is 
by no means as rapid at the present time as it was in earlier periods. 














A SIMPLIFIED METHOD OF GRAPHIC CURVILINEAR 
CORRELATION 


By L. H. Bean, Bureau of Agricultural Economics 


The practicing statistician and economist who is frequently called 
upon to determine the quantitative relationships between two or more 
factors often finds it inconvenient or undesirable to apply the formal 
technique of multiple curvilinear correlation, as described in this 
JOURNAL.¹ Time and clerical help are often lacking, or insufficient
data do not warrant the use of the formal technique. It is the writer’s 
experience, shared undoubtedly by others who have seriously attempted 
analysis of time series or other problems, that it is possible to resort to 
simplified methods of multiple correlation requiring little time or labor 
and yielding results of considerable practical value. 

The purpose of this paper is to present a simplified approach to 
multiple curvilinear correlation, based almost entirely on the use of 
graphics. The method is essentially a short cut in arriving directly at 
approximations to the “net” curves without the use of the usual, time-
consuming mathematical procedure. The formal method, it will be
recalled, involves (1) multiple linear correlations to determine a first 
approximation to the net effect of each independent factor on the 
dependent one, (2) computation of residuals or differences between the 
values of the dependent variable and values estimated from the linear
regressions, (3) plotting the residuals as deviations from each of the 
linear regression curves, and then (4) the reduction of the residuals to a 
minimum by a process of successive approximations which involves the 
free hand drawing of curvilinear regression lines. In the approach il- 
lustrated here, steps (1), (2), and (3) are not used. In their stead 
simple scatter diagrams are used, first approximations to net curves 
drawn free hand by inspection, and residuals usually reduced to a 
minimum by subtracting first the effect of one independent variable
and then of another. The first approximation curves are then ad- 
justed, if necessary, with reference to the residual variations until no 
further changes are indicated. The elimination of the first three steps 
reduces the time involved in an ordinary correlation problem to about 
one-fourth. 

1 This JOURNAL, Vol. XVIII, No. 144, “A Method of Handling Multiple Correlation Problems,” by
H. R. Tolley and M. J. B. Ezekiel; and Vol. XIX, No. 148, “A Method of Handling Curvilinear Cor-
relation for any Number of Variables,” by M. J. B. Ezekiel.























To illustrate this simplified method we shall first make use of a gen- 
eral problem typical of cases encountered in actual practice, and then 
present the application of the method to two actual cases. 


I 

The first problem which will be presented in detail is one of 30 obser-
vations, of four variables, in which the variables X2, X3, X4 correlate
almost perfectly with X1. The data appear in Table I. The objective
















TABLE I
DATA USED IN CHARTS I-III

  Item          Raw data           Readings from curves in Chart II     X1 minus
 number    X2    X3    X4    X1      (1)      (2)      (3)      Sum        sum

    1      11    10     9    14     16.2     -2.5     -0.3     13.4       +0.6
    2      20    19    15    24     24.9     +0.2     -1.3     23.8       +0.2
    3       6     6     0     4     11.3     -8.1     +1.2      4.4       -0.4
    4       6    12     6     8     11.3     -2.8     -0.6      7.9       +0.1
    5       8     8    26    16     13.3     +2.5     +0.2     16.0        0
    6       9     8     8    12     14.2     -2.1     +0.2     12.3       -0.3
    7      11     8     8    13     16.1     -2.1     +0.2     14.2       -1.2
    8      14    16    16    18     19.0     +0.8     -1.1     18.7       -0.7
    9      12    10     0     9     17.0     -8.1     -0.3      8.6       +0.4
   10       8     8     8    11     13.3     -2.1     +0.1     11.3       -0.3
   11       4     5    10    11      9.4     -1.3     +2.4     10.5       +0.5
   12      23    26    26    28     27.7     +2.5     -1.6     28.6       -0.6
   13      14    12    10    17     19.0     -1.3     -0.6     17.1       -0.1
   14      10    16    14    14     15.2      0       -1.1     14.1       -0.1
   15      10    10    15    15     15.2     +0.2     -0.4     15.0        0
   16      20    13    20    26     24.9     +1.3     -0.7     25.5       +0.5
   17      12    12    12    16     17.0     -0.7     -0.6     15.7       +0.3
   18      10     2     8    21     15.2     -2.0     +7.4     20.6       +0.4
   19      16     6     5    19     21.0     -3.2     +1.2     19.0        0
   20      20    20    30    27     24.9     +3.1     -1.4     26.6       +0.4
   21      10    10    10    13     15.2     -1.3     -0.3     13.6       -0.6
   22       2     8     6     5      7.5     -2.8     +0.1      4.8       +0.2
   23       8     8     8    11     13.3     -2.1     +0.1     11.3       -0.3
   24      12    10    11    16     17.0     -0.9     -0.4     15.7       +0.3
   25      13     7    12    18     18.0     -0.7     +0.6     17.9       +0.1
   26      15     9     7    17     20.0     -2.5     -0.1     17.4       -0.4
   27      24    28    18    28     28.6     +1.1     -1.8     27.9       +0.1
   28      10    10    30    18     15.2     +3.1     -0.3     18.0        0
   29       4    10     9     8      9.4     -1.7     -0.3      7.4       +0.6
   30       8     6    10    13     13.3     -1.3     +1.2     13.2       -0.2

Standard deviation ..................................................     .411

















of this illustration is to obtain directly, by inspection, without the usual 
preliminary mathematical computations, the net relationships between 
each of the three independent variables X2, X3 and X4, and the depend-
ent X1, and then to compare the results with the true relationships and
with those secured by the usual mathematical procedure. 

The steps involved in this particular illustration, outlined and
described in detail below, are not at all complicated, but it may help in
following them if it be observed that they center around one important
consideration, namely, that the nature of the net relation between the
dependent variable X1 and any one of the independents, say X2, can be
detected in a few observations at a time in which the relation between
X1 and each of the other variables, X3 and X4, appears to be constant,
and that by selecting observations in which the values of X4 are equal,
it is possible to detect the relation of the other variable, X3, to the
variations in X1 not already attributed to X2.

The steps are as follows: 

1. Plot three scatter diagrams, X1 with X2, X1 with X3 and X1
with X4 to determine by inspection, if possible, which of the
three independent variables is the most important in the
variations in X1.

2. Determine by inspection a first approximation to the net
relation between X1 and X2.

3. Determine by inspection the first approximation to the net
relation between X4 (or X3) and the residuals from the first
X1 X2 approximation.

4. Plot the residuals from the curve established in 3 against X3 (or
X4) and determine the relation of X3 (or X4) to these final
residual values of X1.

5. Plot the residuals from the curve established in 4 as deviations
from the other two first approximation curves and make
second approximations, where necessary to reduce the re-
siduals still further.
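
The sequence of steps can be imitated numerically. The sketch below is not the graphic procedure itself: a least-squares line stands in for the free-hand X1 X2 approximation, averages of the residuals within bands stand in for the free-hand X4 and X3 curves, and the second approximations of step 5 are omitted. The band width of 5 and all function names are assumptions. The data are the raw values of Table I, entered as (X2, X3, X4, X1).

```python
from collections import defaultdict
from statistics import mean, pstdev

# Raw data of Table I, observations 1-30, as (X2, X3, X4, X1).
DATA = [
    (11, 10, 9, 14), (20, 19, 15, 24), (6, 6, 0, 4), (6, 12, 6, 8), (8, 8, 26, 16),
    (9, 8, 8, 12), (11, 8, 8, 13), (14, 16, 16, 18), (12, 10, 0, 9), (8, 8, 8, 11),
    (4, 5, 10, 11), (23, 26, 26, 28), (14, 12, 10, 17), (10, 16, 14, 14), (10, 10, 15, 15),
    (20, 13, 20, 26), (12, 12, 12, 16), (10, 2, 8, 21), (16, 6, 5, 19), (20, 20, 30, 27),
    (10, 10, 10, 13), (2, 8, 6, 5), (8, 8, 8, 11), (12, 10, 11, 16), (13, 7, 12, 18),
    (15, 9, 7, 17), (24, 28, 18, 28), (10, 10, 30, 18), (4, 10, 9, 8), (8, 6, 10, 13),
]
x2, x3, x4, x1 = (list(col) for col in zip(*DATA))

def fit_line(xs, ys):
    """Least-squares straight line (stand-in for a free-hand line)."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: my + slope * (x - mx)

def fit_bands(xs, ys, width=5):
    """Average of ys within bands of xs (stand-in for a free-hand curve)."""
    bands = defaultdict(list)
    for x, y in zip(xs, ys):
        bands[x // width].append(y)
    level = {band: mean(vals) for band, vals in bands.items()}
    return lambda x: level[x // width]

def residuals(curve, xs, ys):
    """Vertical deviations of ys from the curve at each x."""
    return [y - curve(x) for x, y in zip(xs, ys)]

step2 = residuals(fit_line(x2, x1), x2, x1)         # X1 less the X2 line
step3 = residuals(fit_bands(x4, step2), x4, step2)  # less the X4 curve
step4 = residuals(fit_bands(x3, step3), x3, step3)  # less the X3 curve

# Standard deviation of X1 and of the residuals left after each step.
spread = [pstdev(v) for v in (x1, step2, step3, step4)]
print([round(s, 2) for s in spread])
```

Each successive figure printed is the scatter left unexplained after one more curve has been removed, which is the quantity the eye judges in the charts.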

(1) By plotting in the form of scatter diagrams X1 and X2, X1 and
X3, X1 and X4, it appears that there is a positive correlation between
X1 and each of the three independent variables, but the correlation
between X1 and X2 is greater than between X1 and either X3 or X4.







































































































CHART I
[Section 1: scatter diagram of X1 against X2, with first approximation. Section 2: deviations from the curve in Section 1, plotted against X4. Section 3: deviations from the curve in Section 2, plotted against X3.]






















We therefore proceed with the X1 X2 scatter diagram to find the nature
of the relation between X1 and X2.

The X1 X2 diagram is shown in Chart I, Section 1, to be visualized at
this point without the approximation lines. The X1 X3 and X1 X4
scatter diagrams not being necessary hereafter are not presented, but
the variables X3 and X4 are shown in Chart III, where they have been
plotted consecutively for ready reference. It should be noted that it 
is essential in this simplified procedure to maintain the identity of the 
individual observations. 

(2) The second step is to draw a first approximation of the net rela-
tion in X1 X2. Is that relation between X1 and X2 (with X3 and X4
constant) positive as indicated in the diagram, or negative? Is it
linear or curvilinear? To answer these questions, we make use of the
fact that if the relation between X1 and X3 and X1 and X4 could be held
constant simultaneously for a group of two or more observations, the
comparable observations in X1 X2 would lie along a line either linear or
curvilinear which would indicate the true regression for X1 X2 for those
two or more observations only. Now X3 and X4 in any two or more
observations would bear a constant relation to X1 under either of these
two conditions, (a) if the X3 values and the X4 values were all equal, or
(b) if the X3 values were equal, and the X4 values were equal. We
therefore proceed to inspect the actual values of X3 and X4 for such com-
binations (see Chart III) and note first that the observations numbered
6, 7, 10, 23, and 26 show equal values for both X3 and X4. We next find
the comparable observations, 6, 7, 10, 23, and 26 in the X1 X2 scatter
diagram and note that they appear to lie along a straight line, which is
tentatively drawn in. Further inspection of the X3 and X4 variables
reveals that in observations 1 and 29 and 2, 8 and 14, they have ap-
proximately the same values. We find and connect the comparable
observations in the diagram X1 X2. We also note that in observations
5 and 28, X3 has low but nearly equal values and X4 has high but nearly
equal values. As before we find and connect the 5th and 28th observa-
tions in the X1 X2 diagram. From the fact that the several lines so
drawn and distributed through the diagram are nearly parallel, it is
evident that the true regression of X1 X2 is probably a straight line of
the slope indicated by the parallel lines. A first approximation may
now be drawn in free hand, and taken as the tentative measure of the
relation between X1 and X2, to be modified later if necessary.
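
The search for observations in which X3 and X4 are held constant can be sketched as follows. This is only an illustration under stated assumptions: it admits exact ties alone, whereas the inspection described above also admits nearly equal values (observation 26, for example, differs from observations 6, 7, 10 and 23 by one unit in each variable), and a least-squares slope stands in for the line drawn by eye through each group. The data are the raw values of Table I as (X2, X3, X4, X1); the function name is hypothetical.

```python
from collections import defaultdict

# Raw data of Table I, observations 1-30, as (X2, X3, X4, X1).
DATA = [
    (11, 10, 9, 14), (20, 19, 15, 24), (6, 6, 0, 4), (6, 12, 6, 8), (8, 8, 26, 16),
    (9, 8, 8, 12), (11, 8, 8, 13), (14, 16, 16, 18), (12, 10, 0, 9), (8, 8, 8, 11),
    (4, 5, 10, 11), (23, 26, 26, 28), (14, 12, 10, 17), (10, 16, 14, 14), (10, 10, 15, 15),
    (20, 13, 20, 26), (12, 12, 12, 16), (10, 2, 8, 21), (16, 6, 5, 19), (20, 20, 30, 27),
    (10, 10, 10, 13), (2, 8, 6, 5), (8, 8, 8, 11), (12, 10, 11, 16), (13, 7, 12, 18),
    (15, 9, 7, 17), (24, 28, 18, 28), (10, 10, 30, 18), (4, 10, 9, 8), (8, 6, 10, 13),
]

# Collect the observations that share the same pair of values (X3, X4).
groups = defaultdict(list)
for obs, (x2, x3, x4, x1) in enumerate(DATA, start=1):
    groups[(x3, x4)].append((obs, x2, x1))

def slope(points):
    """Least-squares slope of X1 on X2 within one group; None if X2 does not vary."""
    n = len(points)
    mx = sum(x for _, x, _ in points) / n
    my = sum(y for _, _, y in points) / n
    sxx = sum((x - mx) ** 2 for _, x, _ in points)
    if sxx == 0:
        return None
    return sum((x - mx) * (y - my) for _, x, y in points) / sxx

slopes = {}
for key, points in groups.items():
    if len(points) > 1 and slope(points) is not None:
        slopes[key] = slope(points)

for (x3, x4), b in slopes.items():
    members = [obs for obs, _, _ in groups[(x3, x4)]]
    print(f"X3={x3}, X4={x4}: observations {members}, slope {b:.2f}")
```

Roughly equal slopes in the separate groups correspond to the nearly parallel lines of the diagram.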

(3) The third step involves the assumption that if the first approxi-
mation in the X1 X2 diagram shows the relation between X1 and X2, the
residual portions of X1 (the vertical deviations from the curve) must be
related to the other variables X3 and X4. We therefore proceed to plot
the X1 residuals or deviations from the X1 X2 tentative regression
against either X3 or X4.
In measuring and transferring these and subsequent residuals or
deviations, it is only necessary to measure visually the value (or dis-
tance) between each individual observation and the approximation
curve, and to plot these positive or negative values opposite the cor-
responding values of the independent variable. In Chart I, Sections
2 and 3, they are plotted above or below horizontal zero lines. In
subsequent transfers the residuals are plotted above or below the
approximation curves.
It is immaterial which of the two independent factors is used first,
but for convenience we may choose the one which appears to have the
greatest influence on the X1 residuals. As an aid in this choice we note
that the greatest negative deviations from the X1 X2 regression, such
as numbers 3 and 9, are associated with very small values of X4 for
those observations, and that the greatest positive deviations, 5, 20, and
28, are associated with very large values of X4. These facts suggest that
X4 may be the dominant factor in determining the positive and nega-
tive residuals. They also suggest that the relation to be expected
between X4 and the residual values of X1 is of a positive character.
Incidentally this method of inspection also at this point throws some
light on the nature of the relation of X3 to the residual values of X1.
For example, we note that number 18 among the observations in X1 X2
is well above the X1 X2 regression but instead of being associated with a
high value for X4 as in the other instances of large positive residuals it
is associated with a low value of X3. This suggests that the relation
between X3 and X1 may be of a negative sort (at least for low values of
X3).
By plotting the vertical deviations from the X1 X2 regression against
X4, as is done in Chart I, Section 2, we obtain a scatter diagram in which
we desire to discover the nature of the relation of X4 to the residual
values of X1 (that is, to X1 from which the effects of X2 have already
been removed). Inasmuch as these residual values in Section 2 are
related to X4 and X3, we may proceed to find the relation of X4 to the
X1 residuals by selecting those observations in which X3 values are
equal. In the observations numbered 1, 9, 15, 24, 28, 29, the X3 values
are equal. Connecting the comparable observations in Section 2 we
obtain a curve of a positive slope, which appears to fit the scatter very
well.
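
A numerical check of this selection is sketched below for the six observations in which X3 = 10. As an assumption, the height of the X1 X2 curve is taken from column (1) of Table I, that is, from the final curve of Chart II rather than from the first approximation of Chart I.

```python
# observation: (X3, X4, X1, reading from the X1 X2 curve, column (1) of Table I)
ROWS = {
    1: (10, 9, 14, 16.2), 9: (10, 0, 9, 17.0), 15: (10, 15, 15, 15.2),
    24: (10, 11, 16, 17.0), 28: (10, 30, 18, 15.2), 29: (10, 9, 8, 9.4),
}

# Residual of X1 from the X1 X2 curve, paired with X4 and ordered by X4.
points = sorted(
    (x4, round(x1 - curve, 1))
    for x3, x4, x1, curve in ROWS.values()
    if x3 == 10
)
for x4, resid in points:
    print(f"X4={x4:>2}  residual={resid:+.1f}")
```

With X3 fixed, the residuals rise steadily with X4, which is the positive slope read from Section 2.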
The adequacy of this curve may now be checked by selecting con-
stant values of X3 for large values of X4 and also for low values. We
note that in observations 2 and 20, X3 has equal values. Connecting
the two corresponding points in Section 2, we obtain a portion of a
curve which is approximately parallel to the curve (for the high values
of X4) already drawn in through observations 1, 9, 15, 24, 28, 29.
Similarly the X3 values in 19 and 11 are practically equal, and a line
connecting the comparable points in Section 2 is approximately
parallel to the first curve (for the low values of X4). If now we take
the first curve (drawn through 1, 9, ..., 29) as the first approximation
of the net relation of X4 to the X1 residuals, we note that many of the
observations in Section 2 do not lie on that curve, presumably because
of the influence of X3.

(4) The fourth step, to determine the nature of the relation between
X1 and X3, is to plot the X1 residuals from the tentative curve in Sec-
tion 2 against the comparable values for X3, as shown in Section 3.
The nature of the relation of X3 to the residual values of X1 is immedi-
ately evident. Instead of the positive gross relationship indicated by
the scatter diagram of X1 X3 made at the beginning of the analysis, we
now find a negative net regression, particularly pronounced for low
values of X3.
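
The contrast between the gross and the net relation can be checked numerically. In the sketch below a simple correlation coefficient stands in for visual inspection, and the curve heights are the readings in columns (1) and (2) of Table I, which come from the final curves of Chart II; both choices are assumptions made for illustration.

```python
from math import sqrt

# Table I, observations 1-30.
X3 = [10, 19, 6, 12, 8, 8, 8, 16, 10, 8, 5, 26, 12, 16, 10,
      13, 12, 2, 6, 20, 10, 8, 8, 10, 7, 9, 28, 10, 10, 6]
X1 = [14, 24, 4, 8, 16, 12, 13, 18, 9, 11, 11, 28, 17, 14, 15,
      26, 16, 21, 19, 27, 13, 5, 11, 16, 18, 17, 28, 18, 8, 13]
C1 = [16.2, 24.9, 11.3, 11.3, 13.3, 14.2, 16.1, 19.0, 17.0, 13.3,
      9.4, 27.7, 19.0, 15.2, 15.2, 24.9, 17.0, 15.2, 21.0, 24.9,
      15.2, 7.5, 13.3, 17.0, 18.0, 20.0, 28.6, 15.2, 9.4, 13.3]   # readings, curve (1)
C2 = [-2.5, 0.2, -8.1, -2.8, 2.5, -2.1, -2.1, 0.8, -8.1, -2.1,
      -1.3, 2.5, -1.3, 0.0, 0.2, 1.3, -0.7, -2.0, -3.2, 3.1,
      -1.3, -2.8, -2.1, -0.9, -0.7, -2.5, 1.1, 3.1, -1.7, -1.3]   # readings, curve (2)

def pearson(xs, ys):
    """Simple (product-moment) correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# X1 with the effects of X2 and X4 removed.
net_residual = [x1 - c1 - c2 for x1, c1, c2 in zip(X1, C1, C2)]

gross = pearson(X3, X1)           # X3 against the original X1
net = pearson(X3, net_residual)   # X3 against the residual X1
print(f"gross r = {gross:+.2f}, net r = {net:+.2f}")
```

The sign changes once the effects of X2 and X4 are removed; a linear coefficient understates the net relation, since the curve is steepest for low values of X3.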

It is evident from the relatively narrow scatter of the observations in 
Section 3 about the first approximation curve drawn through them that 
by means of the three net regression curves developed so far we have 
accounted for nearly all of the variations in X1.

(5) The final step is to check the first approximation curves for any
necessary adjustments which may reduce the residuals about the X1
X3 curve in Section 3 still more. At this point, if desired, the standard
deviation of the residuals from the X1 X3 curve in Section 3 and the
standard deviation of the original values of X1 may be computed to
determine the extent to which the three net curves account for the
variations in X1. Substituting these standard deviations in the usual
formula for ρ, an index of correlation of .997 is indicated, the standard
deviation of the residuals being .46.
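
The usual formula is presumably the index of correlation computed from the two standard deviations, ρ = sqrt(1 - s_res² / s_1²). A minimal sketch on that assumption follows, with the standard deviation of X1 computed from the raw data of Table I in its population form (also an assumption) and the residual standard deviation of .46 taken from the text.

```python
from math import sqrt
from statistics import pstdev

# Original values of X1, observations 1-30 of Table I.
X1 = [14, 24, 4, 8, 16, 12, 13, 18, 9, 11, 11, 28, 17, 14, 15,
      26, 16, 21, 19, 27, 13, 5, 11, 16, 18, 17, 28, 18, 8, 13]

def index_of_correlation(sd_residuals: float, sd_dependent: float) -> float:
    """rho = sqrt(1 - s_res^2 / s_1^2)."""
    return sqrt(1.0 - (sd_residuals / sd_dependent) ** 2)

sd_x1 = pstdev(X1)
rho_first = index_of_correlation(0.46, sd_x1)
print(round(sd_x1, 2), round(rho_first, 3))
```

Because the residual scatter is small beside the scatter of X1 itself, the index is insensitive to modest changes in the residual standard deviation.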

To complete the analysis by making a final test of the adequacy of
the first approximation net regression curves, we follow the usual pro-
cedure of plotting the final residuals, measured from the curve in Sec-
tion 3 against the original values of X2, as deviations from the curve in
Section 1. This step is shown in Chart II, to which have been trans-
ferred the first approximations from Chart I. The scatter of the resid-
uals about the X1 X2 curve indicates that no material adjustment in the
shape or slope of that curve is necessary. The residuals are next
plotted as deviations from the first approximation curve in Section 2,
against the original values of X4. Here the scatter about the X4 curve
(in Section 2), does indicate that a slight raising of the first approxima-
tion curve for the higher values of X4 as well as for the very low ones
would reduce some of the residuals still more. This adjustment, con-
stituting the second approximation for the X1 X4 curve is shown by the
solid line in Section 2. Had the scatter been wider about this curve it
probably would have been desirable to follow the usual procedure of
averaging or grouping the deviations according to the values of X4 in
order to determine more exactly the shape of the second approximation
curve. The scatter about this second approximation X1 X4 curve now
indicates how much of the variations in X1 have not been accounted for
by the three curves, the first approximation of X1 X2, the second ap-
proximation of X1 X4 and the first approximation of X1 X3.

CHART II
[Section 1: final residuals plotted as deviations from the X1 X2 curve. Section 2: deviations plotted about the X1 X4 curve, with first and second approximations. Section 3: deviations plotted about the X1 X3 curve, with first and second approximations.]

The reduced residuals, that is, the deviations about the second X1 X4
curve are next plotted as deviations about the first approximation
X1 X3 curve in Section 3 (Chart II), to test the adequacy of the latter.
The resulting scatter indicates the desirability of lowering somewhat
the first approximation X3 curve. This adjusted curve now becomes
the second approximation X3 curve.

The extent to which these two adjustments have reduced the first set
of residuals may now be seen either in the smaller deviations about the
second approximation X1 X3 curve than those about the first X1 X3
curve or by computing values for X1 from the three final net regressions.
The computed values are obtained simply by reading for each of the
variables X2, X3, and X4, the corresponding X1 values from the final
curves in Chart II, and adding them algebraically. The readings from
these curves, and the differences between them and the actual values of
X1 are given in Table I. The standard deviation of these differences,
or final residuals in Section 3 is .411, and the index of correlation is .998
(compared with .46 and .997, respectively, for the readings from the
first approximations).
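
The algebraic addition of the readings may be illustrated with the first five observations of Table I. The sketch only repeats the arithmetic of the table; the variable names are illustrative.

```python
# (X1, reading (1), reading (2), reading (3)) for observations 1-5 of Table I
ROWS = [
    (14, 16.2, -2.5, -0.3),
    (24, 24.9, 0.2, -1.3),
    (4, 11.3, -8.1, 1.2),
    (8, 11.3, -2.8, -0.6),
    (16, 13.3, 2.5, 0.2),
]

# Computed X1 is the algebraic sum of the three curve readings.
computed = [round(r1 + r2 + r3, 1) for _, r1, r2, r3 in ROWS]
# Final residual is the actual X1 less the computed value.
differences = [round(x1 - c, 1) for (x1, *_), c in zip(ROWS, computed)]

for (x1, *_), c, d in zip(ROWS, computed, differences):
    print(f"actual {x1:>2}  computed {c:>5}  difference {d:+.1f}")
```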

At this point it may be observed that the arbitrary placing of the 
approximation curves without reference to the average values of X1 and
of the other variables does not affect the values of X1 computed from
the curves. For example, had the approximation curve in Section 1 
been placed higher, the residuals in Sections 2 and 3 would have been 
correspondingly decreased, and the curves lowered. 
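
In symbols, and using notation introduced here only for illustration, let f_2, f_4 and f_3 be the three curves and let the Section 1 curve be raised by a constant c. The residuals carried to the later sections fall by c, so that the curves fitted to them are lowered by amounts c_4 and c_3 with c_4 + c_3 = c, and the computed value is unchanged:

```latex
\begin{aligned}
\hat{X}_1 &= \bigl[f_2(X_2) + c\bigr] + \bigl[f_4(X_4) - c_4\bigr] + \bigl[f_3(X_3) - c_3\bigr] \\
          &= f_2(X_2) + f_4(X_4) + f_3(X_3) + (c - c_4 - c_3) \\
          &= f_2(X_2) + f_4(X_4) + f_3(X_3).
\end{aligned}
```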

CHART III
[Upper section: values of X3 and X4, by observations. Lower section: approximation curves compared with the true curves; (1) relation of X2 to X1, (2) relation of X4 to X1, (3) relation of X3 to X1.]

In order to determine whether the results obtained by the simplified
approach to curvilinear correlation are reasonably accurate, we may
compare them with the true curves and with the approximations that
are obtainable by the usual method. To facilitate this comparison we
have used in this general illustration the data given by Ezekiel in his
“Method of Handling Curvilinear Correlations” ¹ which are derived
from the formula

$$X_1 = X_2 + \frac{20}{X_3} + 2\sqrt{X_4} - 5.$$

The true net curves for X1 X2, X1 X3, and X1 X4, according to this
formula are indicated in the lower half of Chart III and are compared
with the curves obtained by the simplified method of correlation. The
approximation curve in Section 1 has a slope only slightly different
from that of the true curve. The approximation curve in Section 2
does not rise as much for values of X4 between 0 and 5, as does the true
curve, but this is due to the fact that the data used in this problem
contain no values for X4 between 0 and 5. The approximation in
Section 3 also differs only slightly from the true curve.

1 This JOURNAL, December, 1924.
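
The generating formula can be tried against the raw data. The sketch below assumes that the formula reads X1 = X2 + 20/X3 + 2√X4 - 5; the printed formula is damaged in this copy, so this reading is an inference, supported by its agreement with Table I. The function name and the choice of sample observations are illustrative.

```python
from math import sqrt

def true_x1(x2: float, x3: float, x4: float) -> float:
    """Assumed reading of the generating formula: X2 + 20/X3 + 2*sqrt(X4) - 5."""
    return x2 + 20.0 / x3 + 2.0 * sqrt(x4) - 5.0

# observation: (X2, X3, X4, X1) from Table I
SAMPLE = {
    1: (11, 10, 9, 14), 2: (20, 19, 15, 24), 5: (8, 8, 26, 16),
    9: (12, 10, 0, 9), 18: (10, 2, 8, 21), 28: (10, 10, 30, 18),
}

# Tabulated X1 less the value given by the assumed formula.
gaps = {
    obs: round(x1 - true_x1(x2, x3, x4), 2)
    for obs, (x2, x3, x4, x1) in SAMPLE.items()
}
for obs, gap in gaps.items():
    print(f"observation {obs:>2}: tabulated minus formula = {gap:+.2f}")
```

For these observations the tabulated X1 agrees with the assumed formula to within half a unit, as rounding to whole numbers would produce; not every observation in Table I agrees so closely.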

The close agreement between the true curves and the approximations 
may be compared with the results obtained by the Ezekiel Method, by 
referring to this JOURNAL, page 446 of the December, 1924, issue. It
will be observed that the approximation curves there derived show in
general practically the same agreement with the true curves. For the
curves X3 X1 and X4 X1 the agreement is somewhat closer as developed
here (in Chart III). 

The standard method gave a correlation index of .994 after three suc- 
cessive approximations which may be compared with the correlation 
index of .998 after only one adjustment as indicated above. The time 
required to make the correlation analysis by the usual method by an 
experienced person is about eight hours. The steps outlined here take 
less than two hours even when the nature of the curves is not known. 

The question may be raised whether the simple approach can be con- 
veniently used in problems involving more observations and more 
variables than those in the illustration. Inasmuch as the facility of 
this method depends on detecting approximate net regressions by in- 
spection instead of by mathematical computation of linear regression 
and successive approximations, too many observations in a scatter dia- 
gram are likely to make it difficult to find the true relationships. But 
this limitation can be overcome by dividing the problem into two or 
more sections and treating each separately. Where the observations 
are numerous and variables exceed four or five, it should be possible to 


TABLE II
DATA USED IN CHARTS IV AND V

                       Chart IV                                    Chart V

                      Price per     May price     Index of     Price per     Index of
         Produc-      bushel re-    per cwt. of   cotton       pound of      production
 Year    tion 1       ceived by     old potatoes, consump-     cotton 3      of manufac-
                      producers     Chicago       tion 2                     tures 2

         (Million
         bushels)     (Dollars)     (Dollars)     (Per cent)   (Cents)       (Per cent)

 1919      ...           ...           ...        [illegible]    22.0            84
 1920      ...           ...           ...            95         24.7            86
 1921     21.2          2.12           .87            88         14.5            66
 1922     24.1          1.34          1.70            99         18.2            87
 1923     18.7          1.67          1.13           106         25.3           101
 1924     29.4           .99          1.50            89         30.6            94
 1925     20.4       [illegible]      1.13           105         24.6           105
 1926     23.7          1.72          3.23           109         19.7           108
 1927     29.6          1.55          3.51           122         15.3           106
 1928     37.4           .65          1.48           107         20.4       [illegible]

1 Potato production 10 early states.
2 1923-25 = 100, Federal Reserve Board.
3 At New Orleans, crop year ending in the indicated calendar year.













group the data according to equal values of some of the variables, and
to analyze each group separately. 


II 


The application of the simple approach to multiple curvilinear corre- 
lation to actual problems is illustrated in Charts IV and V, the data for 
which are contained in Table II. 

In Chart IV is an analysis of the price of early potatoes received by
growers, showing the effect of the production of early potatoes and of
the competitive effect of the price of old potatoes for the years
1921-1928 inclusive. Section 1 contains a scatter diagram of production
and price. Section 3 shows the variations in the third variable, the
price of old potatoes. The dotted lines in Section 1 connecting
observations ’26 and ’27, ’23 and ’25, and ’22 and ’24, represent the
effect of production in years of approximately high, average, and low
prices of old potatoes. They suggest the approximation curve shown in
the solid line. Deviations from the approximation curve in Section 1
are plotted in Section 2 against the corresponding prices of old
potatoes. A curve drawn through them represents the influence of the
price of old potatoes, with the effect of production of early potatoes
held constant. The algebraic sum of the readings from the curves in
Section 1 and Section 2 is compared with the actual prices in Section 4.

CHART IV
THE EFFECT OF EARLY POTATO PRODUCTION AND THE PRICE OF OLD POTATOES
ON THE GROWERS' PRICE OF EARLY POTATOES
[Section 1: effect of production of early potatoes; price against commercial production, million bushels. Section 2: effect of price of old potatoes; price deviations against May price of old potatoes at Chicago. Section 3: May price of old potatoes at Chicago, by years. Section 4: actual and computed prices received by producers of early potatoes, by years.]

In Chart V is an analysis of the effect of price and business activity 








































396 American Statistical Association [48 


on the domestic mill consumption of cotton for the years 1919 to 1928, 
inclusive. Section 1 contains a scatter diagram, with the Federal 
Reserve Board index of cotton consumption on a calendar year basis 
plotted against the yearly average price of cotton for the crop of the 
preceding year (approximating a six-month lag). Section 3 contains 


CHART V 


THE EFFECT OF COTTON PRICES AND BUSINESS ACTIVITY ON 
DOMESTIC COTTON ‘CONSUMPTION 








































































































COTTON T T T T 120 
CONSUMPTION |"? (3) iwoex OF BUSINESS ACTIVITY, MANUFACTURES 
EFFECT OF PRICE 1923-1925 = 100) | 
130 1923-1925 =100 | 110 
120 100 
110 90 
100 80 
90 70 
80 —— 60 
14 18 22 26 30 34 1919 1921 1923 1925 1927 1929 
PRICE CROP YEAR (ENDING ) 
DEVIATIONS, COTTON CONSUMPTION FROM PRICE 
CONSUMPTION T T 7 130 
CURVE (2) EFFECT OF CHANGES! I £2 (*) | | 
| IN BUSINESS ACTIVITY i 2 COTTON CONSUMPTION IN THE us | 
0 “s 1923-1925 =100 -. ae Bi 
| Esrimors 
-10 - 110 
} A mes = | 
| } | 4 | “ 
20 =-— ? 100 
| PN IF | 
-30 ‘me | = . an we oe + meen 
| = = | 
-40 Oe : “= ee 80 
-40 -30 -20 -10 0 +10 1919 1921 1923 1925 1927 1929 


DEVIATIONS, BUSINESS ACTIVITY FROM TREND 


the Federal Reserve Board index of manufactures used here as a meas- 
ure of business activity. An arbitrary trend line has been drawn 
through this series, as suggested by the dotted lines connecting the 
indexes for 1923, 1926 and 1927, and for 1919, 1920, 1922 and 1924. 
Since the indexes of business activity for 1923, 1926 and 1927 are ap- 
proximately of the same value in relation to the trend, the comparable 
observations in the price-consumption diagram may be connected to 
obtain a suggestion of the influence of price for these three years of ac- 
tive business. Similarly, for the years of activity below the trend, 
1919, 1920, 1922 and 1924. The dotted lines in Section 1 suggest the 
approximation curve shown by the solid line. Deviations from this 
first approximation are next related to the index of business activity, 
but since in drawing the first approximation in Section 1 deviations 
from trend were considered, it is necessary to use the deviations from 










trend in Section 2. The observations in the scatter diagram fall on an 
obvious straight line, which is drawn in to represent the relation of 
business activity to the consumption of cotton. The elimination of 
trend in this problem is equivalent to holding constant the influence of 
those factors which vary uniformly with time. 
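The two graphic steps described for Chart V can be imitated numerically. The sketch below is only an illustration under stated assumptions: the observations are invented, and least-squares straight lines stand in for the freehand curves that the article draws by eye. The names `fit_line`, `price`, `activity_dev` and `consumption` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical observations, invented for illustration (not the article's series).
price = [30.0, 16.0, 18.0, 26.0, 29.0, 24.0, 20.0, 14.0, 19.0]
activity_dev = [-12.0, -30.0, -8.0, 4.0, -10.0, 0.0, 5.0, 3.0, 8.0]
consumption = [130.0 - 1.2 * p + 0.5 * d for p, d in zip(price, activity_dev)]

# Step 1: first approximation of the price relation, using only observations
# in which business activity stands in about the same relation to its trend.
similar = [i for i, d in enumerate(activity_dev) if abs(d) <= 5.0]
a1, b1 = fit_line([price[i] for i in similar], [consumption[i] for i in similar])
first_resid = [c - (a1 + b1 * p) for c, p in zip(consumption, price)]

# Step 2: relate the deviations from the price relation to the deviations
# of business activity from trend.
a2, b2 = fit_line(activity_dev, first_resid)
final_resid = [r - (a2 + b2 * d) for r, d in zip(first_resid, activity_dev)]

sse_first = sum(r * r for r in first_resid)
sse_final = sum(r * r for r in final_resid)
print(round(sse_first, 2), round(sse_final, 2))
```

The second sum of squares should come out much smaller than the first, which is the numerical counterpart of the small residuals remaining about the line in Section 2.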

Second approximations are not shown here because the residuals
about the curves in Section 2 of Charts IV and V are small. Usually
the curves, even in cases such as these where practically all of the
variations in the dependent variable are accounted for by the first
approximations, should be checked by plotting the final residuals about them. On this
point note that a slight lifting of the curve in Chart V, Section 1 at 
prices of 18-19 cents would place the observation ’22 on the line, and 
lowering it for prices around 30 cents would raise observation ’24 to the 
line, thus reducing still more the residuals in Section 2 or the differences 
between the actual and computed values in Section 4. 


III 


The three illustrations of the simplified approach to multiple cur- 
vilinear correlation are intended primarily to demonstrate how net 
curves may be arrived at expeditiously by making use of the criterion 
that net relationships, whether in a large or small number of observa- 
tions, require holding constant the influences of the other variables 
that are being considered in the analysis. They should also demonstrate
that no fixed device is here suggested for selecting the observations
in which one or more of the independent variables may be considered
to have an equal or constant relation to the dependent variable.
Devices other than the three illustrated here will suggest themselves to the
analyst if the theory underlying multiple curvilinear correlation and the
problem in hand are properly understood.¹

¹ For other devices, see the seven illustrations, together with the three presented here, dealing with
variations in prices, acreage, yield and livestock numbers as described in a preliminary report, in two
parts, on “Applications of a Simplified Method of Curvilinear Correlation,” issued by the Bureau of
Agricultural Economics, U. S. Department of Agriculture.















































THE NEED FOR AN INDEX FOR SOCIAL DATA 


By Mary JOHNSTON 


In recent years there has been so vast an amount of fact gathering in 
the field of the social sciences that it is increasingly important to have 
the various bodies of data classified and made readily available for the 
use of research workers. 

Primarily to meet its own needs in this respect, the Institute of 
Social and Religious Research, early in 1929, undertook a study of the 
whole problem of indexing certain of these data, commencing with the 
most important and valuable of all available source data for social and 
economic research—namely, the returns of the Bureau of the Census. 

Some of the results of this experimental study, so far as it concerns 
the Census, are reported in the following pages. They seem at least 
to raise the question of whether the Government itself might not 
consider the advisability of revising its present form of indexing the 
vast bulk of population statistics so as to make the important data 
contained in them more readily accessible to research workers. 

As an example of what might be done with most bodies of published 
governmental statistics the following pages present a suggested index 
of certain population data for the years 1920, 1910 and 1900 for cities 
having 100,000 inhabitants or more. 

This particular sample was chosen because it illustrates rather 
clearly under four heads the advantages of this method of indexing 
census materials. 

First, even the simplest and most fundamental population data are 
scattered throughout all census volumes pertaining to population. 
Tables I to V of the suggested index here presented give references 
only to information regarding total population, sex, age, and general 
nativity; and yet if the eye runs down the page references listed in these 
tables, references to nine different census volumes are found. In fact, 
only one population volume, that on occupations, 1910, is not referred to. 

Second, although data are classified under subject headings, not 
all the available information is found under a given heading. Suppose 
we are studying the recreational needs of Newark, New Jersey. To 
do this intelligently we should have the age data in quite small group- 
ings for the children under 20 years because the recreational activities 
of young people change so much from year to year. 

If we turn to Index Table IV,¹ which is an index of all population


¹ To avoid confusion, the suggested tables that accompany this article are referred to in the text
as “Index” tables.

























































data in which age is the primary classification, we find that for 1920 
four separate age groups (referred to as age groups A, B, C, and D) 
are available for a city like Newark. But additional age data are also 
available in other census tables in which age is not the primary classifi- 
cation. Such materials are summarized in Table V of the accompany- 
ing index. In fact, Table V shows that some of this supplementary 
material is better adapted to an analysis of the recreational activities 
of the young people of Newark than is the material directly classified 
in the census by age groups. Moreover, it is of special interest to note 
that this supplementary material is available by wards, whereas the 
census material regularly classified by age is not. 

Third, the scope of the census is broader than is generally believed. 
For instance, Index Table I contains references to the distribution of 
population in “metropolitan districts” and to other special topics
which few people would look for in census reports. In addition, certain 
tables similar to that showing the per cent of Negroes in the total 
population in selected cities (see Index Table III) furnish a veritable 
mine of information on special topics. 

Fourth, certain terms are defined differently in different census 
volumes. In 1900 the term “native white of foreign parentage” signified
all native whites with one or both parents foreign, but in the
censuses of both 1910 and 1920 this term referred only to native whites 
with both parents foreign. In the suggested index these differences 
are clearly indicated. 

Since the index tells exactly what information is available on every 
point, it should enable a research worker to use census data with a 
minimum of waste effort. 

Suppose we are studying the growth of the city of Cleveland since 
1900 and are interested in the changes in the composition of its popula- 
tion during that period. Without an index such as is here illustrated 
we should have to find our way through the maze of census facts 
published in ten volumes, and should have to look under each major 
head in each volume to be sure that no available information about 
Cleveland was being overlooked. With the classified index suggested, 
the search for usable material would be virtually eliminated. The 
first step would be to find out from a table (such as Table 52, page 320 
in Volume I, 1920 Census) the population of Cleveland in 1900, 1910 
and 1920 in order to determine the city’s size classification at each 
census date. The data show that it had a population of 796,841 in 
1920, of 560,663 in 1910, and of 381,768 in 1900. 

Turning now to Index Table I, we find that for cities of the size of 
Cleveland, total population figures are available not only for the city 













as a whole but also for wards and for the metropolitan district both 
in and outside the city. 

The index shows not only that data by sex and by principal classes of 
the population are available for the city as a whole and by wards, but 
also exactly what these items of information are and where they may 
be found. 

In regard to age distribution, Index Table IV shows that while four 
age groups are given for 1920, only one is given for 1900. Therefore, 
in comparing the age composition of the population for the three census 
periods, it will be necessary to use this one, which is age group B. As 
it happens, this particular grouping is given for 1900 and 1910, but not 
for 1920; therefore it will also be necessary to adjust the figures of age 
group A for 1920, so that they will be comparable with those of age 
group B. 

Index Table IV does not give any references to age data by wards for 
any city; but turning to Table V, in which the supplementary age data
are listed, we find references for various age groups by wards which, 
when combined, give a fairly complete age distribution under the 
following heads: Under 7 years, 7 to 13 years, 14 and 15 years, 16 and 
17 years, 18 to 20 years, 21 years and over, and 18 to 44 years. 

Thus with the use of the index all of the available information about 
the population of Cleveland can be quickly located even though it is 
classified under various headings, seemingly unrelated, and scattered 
through seven of the ten census volumes dealing with population. 
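The kind of lookup that such a classified index makes possible can be sketched as a small keyed table. The three entries below are transcribed from the legible rows of Index Table III; the record layout, the names `Source`, `INDEX` and `find_sources`, and the wording of the keys are assumptions made for illustration, not part of the suggested index itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    report: str  # census report (year)
    volume: str
    table: str
    page: str


# Keys are (topic, census years covered); layout is hypothetical.
INDEX = {
    ("Total population: number, per cent and decennial increase", "1900-1920"):
        Source("1920", "II", "16", "50"),
    ("Indian, number only", "1900-1920"):
        Source("1920", "II", "18", "75"),
    ("Total population by wards or assembly districts", "1920"):
        Source("1920", "III", "13", "under each state"),
}


def find_sources(keyword):
    """Return every index entry whose topic mentions the keyword."""
    keyword = keyword.lower()
    return {key: src for key, src in INDEX.items() if keyword in key[0].lower()}


for (topic, years), src in find_sources("wards").items():
    print(topic, years, src.report, src.volume, src.table, src.page)
```

A worker interested in ward data would thus be sent directly to the 1920 report, Volume III, Table 13, without paging through the other volumes.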

The task of classifying governmental statistics so as to make them 
more readily available to research workers could best be undertaken 
by the governmental bureaus themselves since they naturally know 
their own materials best. If the different departments were to under- 
take such a task they could not only do all or more than it has been 
possible for an outside agency to do, but they could in addition list 
valuable materials which they tabulate but do not publish. Such 
unpublished sources are known to be large. For example, it is true 
in general that, as a routine procedure of the Census Bureau, data 
for all cities of 25,000 inhabitants and over have been tabulated in 
the same detail; but for the smaller cities in this group many of these 
items are not published at all. Nevertheless they are available in 
the files of the Bureau in Washington. This means that the detailed 
data given in published reports of the Census for the large cities can 
as a rule be obtained for the smaller cities from the Census Bureau's 
files for the merely nominal cost of copying off the figures. 

The $40,000,000 spent in the 1930 census will produce a great new 
body of tabulated data, only part of which will be published. At the 










relatively low cost of a classified index such as has been suggested, the
Government could make all this material, both published and unpublished,
available for use by large numbers of persons who would be
interested if the data were made more accessible.


ILLUSTRATIVE TABLES 


TABLE I

DISTRIBUTION OF TOTAL POPULATION IN CITIES HAVING 100,000 INHABITANTS
OR MORE: 1920, 1910 AND 1900

Source: Report | Vol. | Table | Page

Cities of 200,000 inhabitants or more:
    Metropolitan district:
        Population
        Area in acres *
        Civil divisions
    In city proper:
        Population
        Area in acres *
    Outside city proper:
        Population
        Area in acres *
    Adjacent territory:
        Population
        Area in acres *
        Civil divisions

Cities of 100,000 to 200,000 inhabitants:
    Total for central city and adjacent territory:
        Population
        Area in acres *
        Civil divisions
    In city proper:
        Population
        Area in acres *
    Outside city proper:
        Population
        Area in acres *

Summary table for all central cities and adjacent territory:
    Population
    Per cent of increase
    Area in acres *

All individual cities having 100,000 inhabitants or more:
    Total population
    Decennial increase in population

Largest fifty cities at each census arranged in order of rank:
    Total population

Total population by wards and assembly districts





1920 
1910 


1920 
1910 


1920 
1910 


1920 
1920 


1920 
1920 
1910 
1900 


I 
III 
het a 





10 63 
50 | 74 



















* Not given in 1900. 


† Under each state.













































TABLE II 
SEX DISTRIBUTION OF POPULATION IN CITIES HAVING 100,000 INHABITANTS OR 
MORE: 1920, 1910 AND 1900 








| 























                                                      Census        Source
                                                      years         Report   Vol.       Table      Page

Total population (Male, Female, Males to 100          1920          1920     II         [?]        117
  females)                                            1900, 1910    1910     I          26 & 27    278
    All classes                                       1900-1920
    White                                             1900-1920
    Negro                                             1900-1920
    Indian, Chinese, Japanese and all other           1900-1920
    Native white                                      1900-1920
      Native parentage                                1900-1920
      Foreign or mixed parentage                      1900-1920
      Foreign parentage                               1920
      Mixed parentage                                 1920
    Foreign-born white                                1900-1920
Total population by wards or assembly districts       1920          1920     III        13         †
  (Male, female)                                      1910          1910     II & III   V          †
                                                      1900          1900     I          23 & 24    609
    Native white                                      1900, 1920
    Foreign-born white                                1900, 1920
    Negro                                             1900, 1920


† Under each state.


TABLE III 


COLOR OR RACE, NATIVITY AND PARENTAGE OF POPULATION IN CITIES 
HAVING 100,000 INHABITANTS OR MORE: 1920, 1910 AND 1900 












































| | Source 
} Census | 
| years | ] | 
Report Vol. | Table | Page 
| 
Total population (Number, per cent and decen-| 
nial increase)...... ..| 1900-1920 | 1920 II 16 | 50 
BIE, 6. ow -ccecovaser ‘ 1900—1920 
White. <eanenwaken ; cae 1900—1920 
Negro........ | 1900-1920 
Indian, Chinese, Japanese and all other... . .| 1900-1920 
Indian, number only eaves 1900-1920 1920 II 18 75 
ee lame : : ; 1900-1920 1920 IT 18 75 
Japanese, “ * er eee err i 1900-1920 1920 II 18 75 
All other, “ a eral ce tesieecne 1920 1920 II 13 47 
Native \ f 1920 1920 II 13 47 
Foreign born ee ee ee ee ee ; 1910 1910 I 37 178 
1900 1900 I 23 609 
I a a rs at * 1900-1920 1920 II 16 50 
Native parentage......... Space Sareea 1900-1920 
Foreign parentage....... caw ; ; 1900-1920 
Mixed parentage....... area Saas 1900-1920 
I 5 og x00 se een we woe awe 1900-1920 1920 II 16 50 
f 1920 1920 III 13 Tt 
Total population by wards or assembly districts.| { 1910 1910 II & Ill Vv tT 
1900 1900 I 23 & 24 609 
All classes 
Native white: 
Native parentage 
Foreign parentage 
Mixed parentage * | 
Foreign-born white } 
Negro | | 
Indian, Chinese, Japanese and all other | 
Special Table:
    Negro population and per cent Negro in total
      population for 181 selected cities ‡            1920 | 1920 | II | 19 | 77
    Similar tables in other census reports            1910 | 1910 | I | 50 | 230
                                                      1900 | 1900 | I | LIX | CXXII

* Combined in 1910 and 1900. † Under each state.


‡ The cities covered by this table are those in which the total population in 1920 was at least 10,000,
and in which the Negro population was at least 5,000, or was at least 10 per cent of the total population.










TABLE IV 
AGE DISTRIBUTION OF POPULATION IN CITIES HAVING 100,000 INHABITANTS OR



























Total population: 
All classes: 





Age group B.. 
Age group C 
Age group D 
Native white: 
Age group B 
Native parentage: 


Age groupA.... 
Age group B. 


1910, 1920 
1900 





MORE: 1920, 1910 AND 1900
Source 
Census 
years 
Report Vol. | Table Page 
| e 
Age group A and single years under 5, for 
cities having 100,000 to 500,000 inhabit- f 1920 1920 II 16 305 
Si an Raa ea ww ae 4 ek oe eae | 1910 1910 I 50 450 
Age group A and single years under 25 for} { 1920 1920 II 15 288 
cities having 500,000 inhabitants or more| | 1910 1910 I 49 437 
sf 1910 1910 I 53 472 
een ey, ene ay | 1900 1900 II 9 122 
Eas ctatila iain dais Slee aaa 1920 1920 III 8 t 
f 1920 1920 II 17 | 362 
re art ae ae \ 1910 1910 I 51 | 464 
Age group C..... 1920 
asin 1910 
| 
| 
| 


Age group D. 
Foreign or mixed parentage 
Age group A . 
Age group B. 
Age group D 
Foreign parentage: 
Age group A.. 
Age group D 
Mixed parentage: 
Age groupA..... 
Age group D.... 
Foreign-born white: 
Age group A.... 
Age group B.. 
Age group C.. 
Age group D 
Negro: 
Age group A.. 
Age group B.. 
Age group C... 
Age group D 
Indian, Chinese, Japanese and all other: 
Age group A 





1910, 1920 


1910 
1900 * 
1910 


1920 
1920 


1920 
1920 


1910, 1920 
1900, 1910 
1920 
1910, 1920 
1910, 1920 
1900, 1910 
1920 
1910, 1920 


1910, 1920 























* Native white-Foreign parentage is the term used in 1900 to signify native white with one or both
parents foreign. † Under each state.


KEY TO TABLE IV


Age group A | Age group B | Age group C | Age group D








(Male, female) 
All ages: 
Under 5 years * 
Under 1 year 


(Male, female) 
All ages: 
Under 5 years 
l ‘nder 1 year 


5to 9 years 5to 9 years 
10 to 14 10 to 14 ' 
lsto19 “ 15to19 ‘* 
Si eatentinatate a 20 to 24 7 
‘eebhauws 25 to 34 ni 
<n 35 to 44 5 
85 to 89 years 45to64 ‘“* 
9to94 * 65 years and over 
95 to 99 Age unknown 


100 years and over 
Age unknown 





| 
| 


(Male, female) 
All ages: 
Under 5 years 
Under 1 year 


5to 9 years 
10 to 14 
15to19 “* 
20to44 “ 


45 years and over 
Age unknown 





(Sex not given) 


All ages:
Under 5 years
5 to 14 “
15 to 24 “
25 to 44 “
45 to 64 “
65 years and over































































* Given in two groups in 1900. 











TABLE V

SPECIAL AGE GROUPS AND AGE DATA FROM TABLES IN WHICH AGE IS NOT
THE PRIMARY CLASSIFICATION


Source 























| Census 
JURES Report | Vol. Table Page 
Total population: 
All classes: 
Militia age (18 to 44 years): 
Male, female........... ioave ae 1920 1920 III 8 + 
ES ee Sree ae } 1900 1900 II 23 169 
Voting age: 
Male, female.......... 1920 1920 II 10 t 
SE beac Cedeentwanas 1900, 1910 1910 II & III II t 
Age group E: 
oa ae diated eee itarioae ane 1920 1920 II 16 463 
NS a eb edhe bw west 1910 1910 I 38 615 
Age group F: 
Ie cs gh EE a as ane 1920 1920 II 17 1085 
Age group G: 
Male, female.......... laa ae 1920 1920 IV 19 & 20 452 
Native white: 
Militia age....... 1900, 1920 
Native parentage: 
Voting age....... eas alah lead 1900-1920 
are oreo: 1910, 1920 
Age group F.......... bie aigra cae 1920 
Foreign or mixed parentage:
Voting age........ 1900-1920 
Age groupE....... 1910, 1920 
Foreign parentage: 
Age groupE...... oe 1920 
ai aii ce aa 60cm 1920 
Mixed parentage:
CS rr 1920 
Age group F... 1920 
Foreign-born white: 
SST TET 1900, 1920 
Voting age....... } 1900-1920 
Age group E.. | 1910, 1920 
Age group F.... 1920 
Negro: 
ee SEEETRELTE EEE | 1900, 1920 
Voting age....... 1900-1920 
Age group E.. 1910, 1920 | 
Age group F. 1920 
sand ov er, , male, female } | 
Chinese—“ \ | 1920 1920 II 16 163 
Japanese—* “ “ “ si 9 i | 1910 1910 I 38 615 
Allother—‘‘ “ “ “ = . }) 
Total population by wards or assembly dis-| { 1920 1920 III 13 t 
ee Lana ea sae Oe 1910 1910 | IL&III Vv t 
| | 1900 1900 II | 27 211 
Militia age........ | 1900, 1920 
Voting age....... 1900-1920 | | 


Age group F and total under 7 years. .| 





1920 | 

















† Under each state.


KEY TO TABLE V


Age group E (Male, female)

16 years and over:
15 to 19 years
20 to 24 years
25 to 34 years
35 to 44 years
45 to 54 years *
55 to 64 years *
65 years and over
Age unknown


Age group F (Male, female)

7 to 13 years
14 and 15 years
16 and 17 years
18 to 20 years


Age group G (Male, female)

10 years and over:
10 to 13 years
14 years
15 years
16 years
17 years
18 and 19 years
20 to 24 years
25 to 44 years
45 to 64 years
65 years and over
Age unknown











* Combined in 1910. 










DETERMINATION OF A PRECISE INDICATION OF 
CHANGE IN CROP ACREAGE¹


By A. J. BEYLEVELD, Cornell University 


Each year the Rural Mail Carriers obtain data for the United States 
Department of Agriculture on crop acreage. Just over 1,500 such 
reports have been obtained annually from Maryland farmers. From 
these reports for 1924 and 1927, samples each containing 100 farms 
were drawn at random from the total number of reports grouped by 
counties. For ten of these samples for each year the farm, corn, winter 
wheat, and clover and timothy hay acreage for each farm was tabu- 
lated. Calculations were then made for each of the 100-farm samples. 
Then these 100-farm samples were combined to form samples of 200, 
300, 400, and so on, up to 1,000 farms. 

An analysis of these reports was made to measure the fluctuations in 
the acreage of corn, winter wheat, and clover and timothy hay, and to 
obtain the most reliable indications of change² in the crop acreage of
Maryland. The 1,000 farms averaged 129.7±1.7 acres per farm in
1924, and 128.3±1.8 acres in 1927. These farms, in 1927, had on an
average, 18.1±0.3 acres of corn, 20.9±0.5 acres of winter wheat and
12.5±0.3 acres of clover and timothy hay. (Table I.)

The stabilizing effect of the increase in the size of the sample is 
shown by increasing the number of farms from 100 by steps of addi- 
tional samples of 100 farms until 1,000 was reached. 

The averages for farm acreage and for the acreage of each of the three 
crops, corn, winter wheat and hay, reach a degree of stability where 
more farms have little effect on the averages after about 800 farms 
have been included. (Table I.) There is practically no difference 
between the averages shown by 800 and by 1,000 farms, nor between 
the probable errors of those samples. The probable error of farm acreage
decreased by about 0.2 acres. The probable errors of crop acreage


¹ Acknowledgments are due to Mr. W. F. Callander, Chief of the Division of Crop and Livestock
Estimates, for extending to the author the opportunity to work on this problem, and for the equipment
and clerical assistance; to Mr. H. R. Tolley and Mr. C. F. Sarle for invaluable assistance in developing
the mathematical formulae involved in this study; to Mr. C. F. Sarle and other members of the Bureau of
Agricultural Economics for valuable suggestions and help in carrying out the work; to Mr. S. R. Newell
and Mr. G. S. Ray for placing the rural carrier acreage data of their offices at the author's disposal.

² The precision of the indication of change in acreage is not solely dependent upon the way the sample
is handled after it has been obtained. It is further necessary that the reports should be representative
of the area from which they are drawn, and that they be not subject to an excessive amount of bias that
cannot be eliminated.




































TABLE I


ACRES PER FARM AND RELATIVE PROBABLE ERRORS FOR
DIFFERENT SIZED SAMPLES


SEPTEMBER RURAL CARRIER ACREAGE SURVEY, MARYLAND, 1924 AND 1927


       Number of                                                      Relative probable errors
       farms                     Acres per farm                              (per cent)
Year   selected
       at random    Farm         Corn        Wheat       Hay          Farm   Corn   Wheat  Hay

1924     100     123.7±5.1    19.5±1.1    23.5±1.7    10.4±0.7        4.1    [?]    [?]    [?]
1927     100     139.8±6.1    17.9±0.9    21.1±1.5    12.0±0.8        4.3    5.0    7.0    7.0
1924     100     133.6±5.9    19.8±1.1    [?]         12.4±1.1        4.4    5.7    8.1    8.7
1927     100     123.7±5.2    17.7±0.9    20.2±1.4    12.0±0.9        4.2    4.3    [?]    7.3
1924     200     128.7±3.9    19.6±0.8    21.0±1.1    11.4±0.7        3.0    3.9    5.4    5.7
1927     200     131.7±4.0    17.8±0.6    20.7±1.0    12.0±0.6        3.0    3.5    5.0    5.0
1924     300     128.7±[?]    18.8±0.6    20.8±0.9    11.7±0.5        2.6    [?]    4.3    4.5
1927     300     133.5±3.4    17.7±0.5    20.6±0.9    12.4±0.5        2.5    2.8    4.1    4.3
1924     400     129.8±2.9    18.6±0.5    19.9±0.7    11.9±0.5        2.2    2.6    3.7    3.8
1927     400     134.1±2.9    17.9±0.4    20.8±0.8    12.9±0.5        2.2    2.5    3.6    3.7
1924     500     131.5±2.6    18.9±0.4    20.8±0.7    12.4±0.4        2.0    2.3    [?]    3.3
1927     500     132.0±2.6    18.1±0.4    20.9±0.7    12.8±0.4        2.0    2.3    [?]    [?]

1924     600     131.5±2.3    19.0±0.4    21.3±0.6    12.5±0.4        1.8    1.9    2.9    2.9
1927     600     132.8±2.4    18.3±0.4    21.1±0.6    13.0±0.4        1.8    2.1    3.0    3.0
1924     700     131.0±2.1    18.8±0.3    21.2±0.6    12.5±0.3        1.6    1.8    2.7    2.7
1927     700     130.2±2.1    18.2±0.3    20.9±0.6    12.8±0.4        1.6    1.9    2.8    2.8
1924     800     129.7±1.9    18.6±0.3    20.7±0.5    12.5±0.3        1.5    [?]    [?]    2.5
1927     800     128.5±2.0    17.9±0.3    20.7±0.5    12.7±0.3        1.5    1.8    2.7    2.6
1924     900     130.0±1.8    18.6±0.3    20.8±0.5    12.3±0.3        1.4    1.6    2.3    2.4
1927     900     128.1±1.9    18.0±0.3    20.8±0.5    12.6±0.3        1.4    1.7    2.5    2.4

1924    1000     129.7±1.7    18.6±0.3    20.6±0.5    12.2±0.3        1.3    1.5    2.2    2.2
1927    1000     128.3±1.8    18.1±0.3    20.9±0.5    12.5±0.3        1.4    1.6    2.4    2.3








remained unchanged. The averages and the probable errors indicate 
that a sample would have to be increased materially beyond 1,000 
farms in order to obtain any further significant decrease in the variabil- 
ity of the averages. 

The coefficient of variability of farm acreage is just over 60; for corn 
acreage, about 73; for wheat acreage, 105; and hay acreage, 104; indi- 
cating that all these variables are subject to considerable fluctuation. 
Because of this, no satisfactory results can be obtained from samples 
that are somewhat small, and when a certain degree of stability has 
been attained by a larger sample, small increases in that sample are of 
little value in further decreasing variability. Farm acreage, with the
lowest variability, had a probable error of ±1.8 acres in 1927, with
1,000 farms. About 3,100 farms would have been required to reduce
the probable error to ±1.0 acres, and 12,360 farms to reduce it to ±0.5
acres.
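The arithmetic behind these statements can be checked with a short sketch, assuming the usual relations that the probable error is 0.6745 times the standard error and that it shrinks with the square root of the number of farms. The function names are hypothetical. Working from the rounded ±1.8 acres gives 3,240 and 12,960 farms; the article's figures of about 3,100 and 12,360 are consistent with an unrounded probable error near 1.76 acres, which is an inference and not a value stated in the text.

```python
import math

PE_FACTOR = 0.6745  # probable error = 0.6745 x standard error


def relative_probable_error(cv_percent, n_farms):
    """Relative probable error (per cent) of an average based on n_farms."""
    return PE_FACTOR * cv_percent / math.sqrt(n_farms)


def farms_needed(n_now, pe_now, pe_target):
    """Farms required to bring the probable error down to pe_target."""
    return n_now * (pe_now / pe_target) ** 2


# Coefficients of variability quoted in the text (farm acreage is "just over 60").
for name, cv in [("farm", 60), ("corn", 73), ("wheat", 105), ("hay", 104)]:
    print(name, round(relative_probable_error(cv, 1000), 2))

# Rounded probable error of 1.8 acres at 1,000 farms (1927).
print(round(farms_needed(1000, 1.8, 1.0)))  # 3240
print(round(farms_needed(1000, 1.8, 0.5)))  # 12960
```

The relative probable errors printed for 1,000 farms fall close to the last rows of Table I, which suggests the quoted coefficients of variability and the tabled errors are mutually consistent.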
The averages of farm and crop acreage obtained through the survey 




























are used to determine year-to-year changes in crop acreage for the 
annual estimate of acreage harvested. Since all the reports are not 
from the same farms year after year, a direct comparison between the 
acreage devoted to the same crop in successive years cannot be made. 
Therefore the information for successive years for each crop was made
comparable by finding the proportion of one acre of farm land that was 
devoted to each crop in different years. This gives a ratio of crop 
acreage to farm acreage. In 1927, the sample of 1,000 farms showed 
an average of 18.1 acres of corn and 128.3 acres of farm land per farm. 
(Table I.) The ratio of corn acreage to farm acreage is 0.1411. Simi- 
larly, for wheat, the ratio is 0.1628 and for hay 0.0972. (Table II.) 
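As a check on the ratios just quoted, the following minimal sketch divides the rounded 1927 averages of Table I. The names are hypothetical. The corn ratio of 0.1411 is reproduced; wheat and hay come out at 0.1629 and 0.0974 rather than the published 0.1628 and 0.0972, presumably because the article worked from unrounded averages (an assumption, since the unrounded figures are not given).

```python
def crop_ratio(crop_acres_per_farm, farm_acres_per_farm):
    """Proportion of one acre of farm land devoted to the crop."""
    return crop_acres_per_farm / farm_acres_per_farm


# Rounded 1927 averages for the 1,000-farm sample (Table I).
farm_1927 = 128.3
crops_1927 = {"corn": 18.1, "wheat": 20.9, "hay": 12.5}

ratios_1927 = {
    crop: round(crop_ratio(acres, farm_1927), 4)
    for crop, acres in crops_1927.items()
}
print(ratios_1927)
```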


TABLE II


CORN, WHEAT, AND HAY RATIOS AND THEIR RELATIVE PROBABLE ERRORS FOR DIFFERENT
SIZED SAMPLES


SEPTEMBER RURAL CARRIER ACREAGE SURVEY, MARYLAND, 1924 AND 1927


       Number of    Ratio: Corn       Ratio: Wheat      Ratio: Hay        Relative probable errors
       farms        acreage to        acreage to        acreage to        of the ratios (per cent)
Year   selected     farm acreage      farm acreage      farm acreage
       at random                                                          Corn    Wheat   Hay

1924     100      0.1573±0.0059     0.1903±0.0089     0.0841±0.0055       3.78    6.03    6.50
1927     100      0.1282±0.0046     0.1509±0.0091     0.0861±0.0053       3.58    4.70    6.17
1924     100      0.1483±0.0060     0.1387±0.0102     0.0926±0.0069       4.06    7.38    7.46
1927     100      0.1434±0.0067     0.1637±0.0100     0.0970±0.0060       4.69    6.12    6.17
1924     200      0.1526±0.0043     0.1635±0.0071     0.0885±0.0045       3.42    4.37    5.05
1927     200      0.1353±0.0040     0.1569±0.0072     0.0912±0.0040       2.94    4.59    4.38

1924     300      0.1460±0.0037     0.1616±0.0058     0.0906±0.0034       2.54    3.60    3.75
1927     300      0.1325±0.0031     0.1545±0.0058     0.0929±0.0035       2.35    3.74    3.76
1924     400      0.1433±0.0032     0.1537±0.0050     0.0921±0.0031       2.24    3.24    3.41
1927     400      0.1337±0.0028     0.1552±0.0048     0.0964±0.0031       2.06    3.12    3.17
1924     500      0.1436±0.0028     0.1581±0.0044     0.0940±0.0028       1.92    2.76    2.93
1927     500      0.1374±0.0025     0.1586±0.0045     0.0967±0.0028       1.80    2.81    2.90

1924     600      0.1444±0.0023     0.1620±0.0039     0.0949±0.0025       1.61    2.42    2.62
1927     600      0.1380±0.0023     0.1587±0.0041     0.0977±0.0026       1.67    2.58    2.64
1924     700      0.1439±0.0022     0.1620±0.0036     0.0955±0.0023       1.52    2.20    2.39
1927     700      0.1400±0.0021     0.1602±0.0038     0.0985±0.0024       1.53    2.39    2.45
1924     800      0.1435±0.0020     0.1596±0.0033     0.0962±0.0021       1.41    2.05    2.22
1927     800      0.1395±0.0020     0.1609±0.0036     0.0985±0.0022       1.43    2.23    2.2[?]
1924     900      0.1434±0.0019     0.1599±0.0031     0.0949±0.0020       1.33    1.92    2.10
1927     900      0.1403±0.0019     0.1625±0.0034     0.0984±0.0021       1.37    2.09    2.16
1924    1000      0.1434±0.0018     0.1590±0.0029     0.0943±0.0019       1.25    1.84    2.02
1927    1000      0.1411±0.0018     0.1628±0.0032     0.0972±0.0020       1.29    1.96    2.05








The stability of these ratios of crop acreage to farm acreage depends 
on the stability of the sample. Even when an average is sufficiently 
stable to show the average size of farm or the average amount of a crop 
grown, it might still be subject to such fluctuation that it does not
indicate the change in the proportion of one acre of farm land devoted to
the same crop in successive years. The ratios of corn acreage to farm 
acreage for 1927 range from 0.1282 to 0.1411, and for 1924 from 0.1433 
to 0.1573. (Table II.) In the samples with 600 or more farms 
differences between the 1927 and 1924 corn ratios appear in the third 
decimal. Therefore, it is necessary to have the acreage averages
sufficiently accurate for the third and fourth decimals to be reliable in
order to obtain a safe indication of change in acreage. 

The slow decrease in the probable errors of the ratios¹ also indicates
how difficult it is to obtain a reliable ratio unless an abnormally large 
number of farms are used. 

Size of farm is related to the amount of land devoted to important 
crops. Because of this relationship, the effect of the variation in 
acreage on the variation in the ratios is minimized. The coefficient of 
correlation between farm acreage and corn acreage is r=0.6200. In 
1927, the relative probable error of farm acreage was 1.4 per cent and 
of corn acreage, 1.6 per cent. On account of the relationship between 
the two variables, the relative probable error² of their ratio is 1.29 per
cent. In the next step, where the two ratios are compared, this rela- 
tionship is missing and the variability in the two ratios causes greater 
variability in the ratio relative. 
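The way the correlation reduces the error of the ratio can be sketched numerically. The function below assumes the standard result that the squared relative error of a ratio equals the sum of the squared relative errors of its two terms less twice their product times the correlation; its name is hypothetical. With the rounded inputs quoted in the text (1.6 and 1.4 per cent, r = 0.62) it gives 1.32 per cent rather than the published 1.29, a gap presumably due to rounding of the inputs (an assumption).

```python
import math


def rpe_of_ratio(rpe_numerator, rpe_denominator, r):
    """Relative probable error (per cent) of a ratio of two correlated averages."""
    return math.sqrt(
        rpe_numerator ** 2
        + rpe_denominator ** 2
        - 2.0 * r * rpe_numerator * rpe_denominator
    )


# Corn acreage (1.6 per cent) over farm acreage (1.4 per cent), 1927.
print(round(rpe_of_ratio(1.6, 1.4, 0.62), 2))  # 1.32
# The same two errors with no correlation between the variables.
print(round(rpe_of_ratio(1.6, 1.4, 0.0), 2))   # 2.13
```

Without the correlation the error of the ratio would exceed that of either average, which is the situation that arises when two independent ratios are compared.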

The ratio obtained from the two ratios of different years is called a 
ratio relative, and is intended to measure the percentage of change in 
the acreage of each crop between different years. The corn ratios are 
0.1411 for 1927, and 0.1434 for 1924, for 1,000 farms. (Table II.) 
Dividing the 1924 ratio into the 1927 ratio, the ratio relative, 98.40 
per cent, is obtained. (Table III.) This ratio relative shows that for 
corn there was a decrease of 1.60 per cent in acreage for 1927, com- 
pared with 1924. For wheat there was an increase of 2.39 per cent, 
and for hay an increase of 3.08 per cent. In every case the amount of 
change is about equal to the probable error of the ratio relative.³
There is an even chance that the ratio relative lies within the range of 


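1 The probable error of the ratio (R) is calculated from the formula:

$$\mathrm{P.E.}_{R} = \frac{0.6745\,R}{\sqrt{N}}\left[\left(\frac{\sigma_1}{M_1}\right)^2 + \left(\frac{\sigma_2}{M_2}\right)^2 - 2r_{12}\,\frac{\sigma_1}{M_1}\,\frac{\sigma_2}{M_2}\right]^{1/2}.$$

G. U. Yule, An Introduction to the Theory of Statistics, pp. 214-215, 1927 Edition.

2 The relative probable error of the ratio is obtained from:

$$\mathrm{R.P.E.}_{R} = \frac{0.6745}{\sqrt{N}}\left[\left(\frac{\sigma_1}{M_1}\right)^2 + \left(\frac{\sigma_2}{M_2}\right)^2 - 2r_{12}\,\frac{\sigma_1}{M_1}\,\frac{\sigma_2}{M_2}\right]^{1/2} \times 100.$$

3 The probable error of the ratio relative $R_{1927}/R_{1924}$ is:

$$\mathrm{P.E.} = 0.6745\,\frac{R_{1927}}{R_{1924}}\left[\frac{\sigma^2_{R_{1924}}}{R^2_{1924}\,N_{1924}} + \frac{\sigma^2_{R_{1927}}}{R^2_{1927}\,N_{1927}}\right]^{1/2} \times 100.$$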







TABLE III 
RATIO RELATIVES FOR CORN, WHEAT, AND HAY AND THEIR RELATIVE PROBABLE 
ERRORS FOR DIFFERENT SIZED SAMPLES 


SEPTEMBER RURAL CARRIER ACREAGE SURVEY, MARYLAND, 1924 AND 1927

Number of farms      Corn ratio relative,   Wheat ratio relative,   Hay ratio relative,   Relative probable errors (per cent)
selected at random   1927 to 1924           1927 to 1924            1927 to 1924          Corn       Wheat      Hay
                     (per cent)             (per cent)              (per cent)            ratio      ratio      ratio
                                                                                          relative   relative   relative

81.50 +4. 2: 79.29 6.0: 102.38+ 9.18 19 
69 =5.6 118.00 +11. 104.75+10.14 20 
88 .66 +3. 5! 95.96+= 6.03 | 103.06+ 6.89 05 
75 +31 95.61= 4.96 102.542 5 45 
93 .30 +2. 100.98 = 4.5: | 104.67+ 4 O4 
95.68 =2.5% 100.31+ 3 | 102.94 4 64 
95 .56 2.2% 97.96+ 3 102.95+ 3 33 
97.29 =2 98.89+= 3.2 103.142 3.5: 16 

97.21 +1.9: 100.78 = 102.35 = 00 
97 .84 =1.87 | 101.63 = | 103.69 + 91 
1000 a | 98.40+1.77 | 102.39+ 2.75 103.08 + 80 




the probable error and an even chance that it lies beyond these limits. 
Assuming that the true ratio relative is not beyond the limits of one 
probable error, then the range within which it may fluctuate is still 
sufficiently great to make the amount of change shown by a sample of 
1,000 farms not sufficiently precise for estimates of changes in acreage. 

The relative probable errors of the ratio relative¹ show the combined
effect of the variability of the ratios. The relative probable error is 
1.80 per cent for the corn ratio relative. For the corn ratios it was 
1.25 per cent in 1924 and 1.29 per cent in 1927. Because the annual 
farm samples are not made up entirely of the same farms year after 
year, the ratios show no relationship, and the combined effect of their 
variability results in greater variability in the ratio relative than in 
either of the ratios from which the ratio relative is obtained. 
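
The arithmetic behind the corn ratio relative and its probable error can be retraced from the figures quoted above. This is a sketch under the assumption, stated in the text, that the two annual ratios are unrelated, so that their relative probable errors combine in quadrature; the variable names are illustrative.

```python
import math

# Corn ratios for 1,000 farms, as quoted in the text (Table II).
ratio_1924 = 0.1434
ratio_1927 = 0.1411

ratio_relative = 100.0 * ratio_1927 / ratio_1924   # per cent
change = ratio_relative - 100.0                     # per cent change, 1924 to 1927

# Relative probable errors of the two ratios (per cent), as quoted in the text.
rpe_1924, rpe_1927 = 1.25, 1.29

# Unrelated ratios: relative probable errors combine in quadrature.
rpe_relative = math.hypot(rpe_1924, rpe_1927)
pe_relative = ratio_relative * rpe_relative / 100.0  # in points of the ratio relative

print(f"{ratio_relative:.2f} {change:.2f} {rpe_relative:.2f} {pe_relative:.2f}")
```

The results agree with the 98.40 per cent, the decrease of 1.60 per cent, the relative probable error of 1.80 per cent, and the ±1.77 shown for 1,000 farms in Table III.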

The effect of increasing the size of the sample in reducing variation
and fluctuations of sampling grows less with each successive increase
of 100 farms, since the probable error varies inversely as the square
root of the number of reports, and the square roots of the numbers
increase at a decreasing rate with each increase of 100 reports.

The relative probable error of the corn ratio relative is 5.19 per cent
for 100 farms in one case and 6.20 in another. When the number of
reports is increased four times, the relative probable error decreases
to about one-half. (Table III.) Increasing the number of reports nine
times reduces the relative probable errors to about one-third.
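
The rule of thumb in this paragraph follows from the inverse square-root relation. A minimal sketch, assuming the relative probable error is exactly proportional to 1/√N (actual samples, as Table III shows, only approximate this); the function name is hypothetical.

```python
import math


def scaled_rpe(rpe_at_base: float, base_n: int, new_n: int) -> float:
    """Relative probable error expected at new_n reports, given its value at base_n."""
    return rpe_at_base * math.sqrt(base_n / new_n)


base_rpe = 5.19  # per cent, corn ratio relative for 100 farms (from the text)
for factor in (4, 9):
    expected = scaled_rpe(base_rpe, 100, 100 * factor)
    print(f"{factor} times the reports: {expected:.2f} per cent "
          f"({expected / base_rpe:.3f} of the original)")
```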

The indications of change in acreage depending on the four variables 


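1 The relative probable error of the ratio relative $R_{1927}/R_{1924}$ is:

$$\mathrm{R.P.E.} = 0.6745\left[\frac{\sigma^2_{R_{1924}}}{R^2_{1924}\,N_{1924}} + \frac{\sigma^2_{R_{1927}}}{R^2_{1927}\,N_{1927}}\right]^{1/2} \times 100.$$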













are not very satisfactory unless a much larger sample is obtained.
This would require a large number of reports and would involve a large 
amount of time and work. It is vital to crop reporting that the com- 
putations be made in a minimum amount of time, and this cannot be 
done without a very large number of reports.¹ Since it is impracticable
to collect and handle such large samples, in the amount of time avail- 
able, three other methods of handling the smaller sample were tried. 

The method described above depends almost entirely on the size of 
the sample for the degree of stability that may be attained. In the 
second method, the farms were grouped on a regional basis, the state 
being divided into four districts, presumably more homogeneous than 
the state as a whole. The decrease in the variation of farm acreage 
was not very significant. In the case of crop acreage, a marked de- 
crease in variability occurred in districts that were important producing 
areas for corn, wheat and hay. In those areas where the particular 
crop was not very important, the variation was more than when the 
respective crops were taken for the whole state. 

Grouping by size of farm was the third method used. This stratifi- 
cation held the variation in farm acreage within narrow limits, but the 
acreage in any crop remains free to vary. The result is that the rela- 
tionship between acreage in any crop and farm acreage is not very 
significant, and consequently the probable errors are not much smaller 
than in the case of 1,000 farms for the whole state. 

The fourth and final attempt at reducing dispersion was based on a 
direct comparison of the crop acreage for the same farms in successive 
years. By using the same farms, farm acreage as a variable is elimi- 
nated and a direct comparison between acreage for any two years may 
be made. 

When the number of reports for the identical farms is held constant 
at 100, and for the unmatched at 400, the variability of the ratio for 
identical farms is about the same as the variability of the ratio relative 
of unmatched farms. In the case of corn and wheat, 100 reports for the 
same farms for two years are very nearly as reliable as 400 reports from 
unmatched farms. (Table IV.) For hay there is a difference of 13 
per cent. Theoretically, the identical farms are about four times as 
efficient as unmatched farms in increasing the precision of the indica- 
tion of change in acreage. Much less time is involved in making com- 
putations from identical farms than from unmatched farms when once 
the sample of identical farms has been obtained. 
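
The comparison in Table IV can be restated as an equivalent sample size. The sketch below is illustrative: it takes the relative probable errors from Table IV and assumes, as in the preceding section, that the relative probable error of the unmatched sample varies as 1/√N. The function names are hypothetical.

```python
# Relative probable errors (per cent) from Table IV:
# (ratio relative, 400 unmatched farms; ratio, 100 identical farms)
TABLE_IV = {"corn": (2.44, 2.53), "wheat": (3.06, 3.04), "hay": (3.81, 4.31)}
N_UNMATCHED, N_IDENTICAL = 400, 100


def percent_of_unmatched(unmatched: float, identical: float) -> float:
    """Relative probable error of identical farms as a per cent of that of unmatched farms."""
    return 100.0 * identical / unmatched


def equivalent_unmatched_farms(unmatched: float, identical: float) -> float:
    """Unmatched farms needed to match the identical-farm error, assuming RPE varies as 1/sqrt(N)."""
    return N_UNMATCHED * (unmatched / identical) ** 2


for crop, (unmatched, identical) in TABLE_IV.items():
    pct = percent_of_unmatched(unmatched, identical)
    n_eq = equivalent_unmatched_farms(unmatched, identical)
    print(f"{crop}: {pct:.1f} per cent; 100 identical farms ~ {n_eq:.0f} unmatched "
          f"(efficiency about {n_eq / N_IDENTICAL:.1f} to 1)")
```

On these assumptions the efficiency comes to somewhat under four to one for corn, about four to one for wheat, and nearer three to one for hay.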


1 In some central states samples of from 8,000 to 14,000 farms were handled in the field offices in 1928 
for the September Rural Carrier Acreage Survey. This mass of data taxes the field offices to the very 
limit. There are many states where samples of less than 3,000 farms are obtained and from these the 
results are not very satisfactory. 













TABLE IV
COMPARISON OF DISPERSION IN UNMATCHED AND IDENTICAL FARMS

SEPTEMBER RURAL CARRIER ACREAGE SURVEY, MARYLAND, 1924 AND 1927

             Relative probable errors *                Per cent the relative
Crop         Ratio relative for   Ratio for            probable errors of
             unmatched farms      identical farms      identical farms are of
             (per cent)           (per cent)           those of unmatched farms

Corn              2.44                 2.53                  103.7
Wheat             3.06                 3.04                   99.3
Hay               3.81                 4.31                  113.1

* The size of the relative probable errors depends to a large extent on the number of farms (N) in the
sample. For the unmatched farms the number was held constant at 400 and for the identical farms
at 100 in order to make the results more directly comparable.



The acreage reports of the United States Department of Agriculture
are largely based on reports from unmatched farms. The Buchanan
Bill now before the House, if passed, will make it possible to obtain
samples from the same farms or groups of farms for successive years,
which will make the estimates of acreage and crop production more
reliable.



















COTTON FUTURES AS FORECASTERS OF 
COTTON SPOT PRICES 


By Forrest Bee Ashby, University of Pennsylvania


A cotton future is a contract under which one contractor engages to 
deliver to the other party, at a certain future time, a certain quantity 
of “middling” cotton, or its equivalent, at a certain price; and the 
second party promises to accept and pay for the cotton when delivered. 
A margin, relative to the market price, is posted and kept good by each 
party for the protection of the other during the usance of the contract. 

Several classes of cotton dealers buy and sell futures. A very im- 
portant class is the hedging element, which may either buy or sell fu- 
tures in order to protect trade profits arising from transactions in the 
actual cotton market. For instance, a spinner anticipates sales during 
the next several months which will necessitate the use of a thousand 
bales of a specific grade of cotton. He knows that if the present market 
price for his product is maintained he can afford to pay for his particular 
kind of cotton the present market price. He therefore contracts in the 
actual cotton market for 1,000 bales of a certain grade of cotton, de- 
liverable at a certain future time at a certain price, and then, in order to 
protect himself against changes in the market price of his particular 
grade of cotton he sells short a 1,000-bale future of approximately the 
same usance. If we assume that spot cotton falls in price during the 
life of his real contract, he loses money by his forward purchase, but 
(except for changes in difference and in basis) he gains approximately 
as much through his sale of the future, and maintains his original trade- 
profits position, while at the same time he is assured of a constant 
supply of raw material. The cotton merchant from whom this spinner 
made his forward purchase of real cotton, on his part might buy a 
future for 1,000 bales of middling, thus preserving, despite price fluctua- 
tions in the cotton market, his trade differential or gross margin of 
profit between the price at which he buys from the grower and the 
price at which he sells to the spinner. It is evident that the price of 
futures is a matter of little consequence to this hedging class of cotton 
dealers. Their entry into the futures market affects the price when the 
volume of futures bought for hedging purposes exceeds the volume of 
futures offered, and vice versa, but the hedging dealer is entirely content
to buy (or sell) “at the market,” and has no incentive to haggle
over the current quotation for the future which he selects. 
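
The spinner's position described above can be traced with a small calculation. Everything in this sketch is hypothetical: the prices are invented for illustration and are quoted in points (hundredths of a cent per pound) so that the arithmetic is exact, and the function name is not taken from the article.

```python
def hedged_result(forward_price: int, spot_at_delivery: int,
                  future_sold_at: int, future_closed_at: int) -> tuple[int, int, int]:
    """Spinner's gain or loss in points (hundredths of a cent per pound).

    Returns (gain on forward purchase of actual cotton,
             gain on the short future,
             net result of the two together).
    """
    forward_gain = spot_at_delivery - forward_price   # negative when spot falls
    futures_gain = future_sold_at - future_closed_at  # short sale gains when price falls
    return forward_gain, futures_gain, forward_gain + futures_gain


# Hypothetical case: spot falls 3 cents; the future falls by the same amount.
print(hedged_result(2000, 1700, 2020, 1720))

# Hypothetical case with a change in basis: the future falls only 2.85 cents.
print(hedged_result(2000, 1700, 2020, 1735))
```

In the first case the loss on the forward purchase is exactly offset; in the second the small residual loss is the kind of change in basis that the text excepts.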







Another class of dealers interested in the cotton futures market is the 
arbitragers or straddlers in the different futures. They care little 
whether or not the price of spot cotton at the maturity of the futures is 
correctly forecasted by the current quotations for deliveries in coming 
months, but are keenly interested in making a profit by leveling off any 
differences in adjacent futures. For instance, if in November an 
arbitrager finds it possible to buy a March future at 15 cents and sell a 
May future at 15.35 cents, assuming the cost of carrying and delivery 
to be three-tenths of a cent, he will purchase the March future and sell 
the May future, and will realize a profit in one of two ways. Occa- 
sionally he will take delivery in March, hold the cotton two months 
and use it to deliver upon his May future, but usually will close out his 
transaction by reversing his operations when the current quotations 
for the two futures come together at, say, 15.02 cents and 15.33 cents, 
a movement partially induced, moreover, by his own original purchase 
and sale. To the extent that arbitraging in different months resembles 
arbitraging between the present spot price and the spot price of coming 
months, the class of arbitragers just described blends into a third class 
of futures dealers. 
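
The arbitrager's two ways of realizing a profit can be worked through with the figures of the example. This is an illustrative sketch: prices are expressed in points (hundredths of a cent per pound), the function names are hypothetical, and commissions are ignored.

```python
def profit_by_delivery(bought_near: int, sold_far: int, carry: int) -> int:
    """Take delivery on the near future, carry the cotton, deliver on the far future."""
    return sold_far - bought_near - carry


def profit_by_closing(bought_near: int, sold_far: int,
                      close_near: int, close_far: int) -> int:
    """Reverse both transactions once the two quotations have come together."""
    return (close_near - bought_near) + (sold_far - close_far)


# Figures from the text: March bought at 15.00, May sold at 15.35,
# carrying and delivery cost 0.30 cents, closing quotations 15.02 and 15.33.
print(profit_by_delivery(1500, 1535, 30))        # points per pound
print(profit_by_closing(1500, 1535, 1502, 1533))  # points per pound
```

On these figures the first course yields five points and the second four points per pound, before any expenses not named in the text.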

This third class of futures dealers is composed of those persons who 
make a market in futures for the hedging and straddling elements, and 
whose buying and selling limits are based upon their speculations as to 
the spot price of cotton in the month in which the futures mature. It 
has been said above that the hedging seller will sell futures regardless of 
the current quotation, since any loss or gain on the future will be 
counteracted by his opposite gain or loss on his “real” cotton; and the
hedging buyer occupies a similar position. The straddler, of course, 
sees his profit from the first in the differences existing between the two 
months in which he deals. Persons falling within the third class, how- 
ever, are true speculators in coming spot prices, because they are long 
or short of the market in an unprotected position, and depend for their 
profits upon being able to deliver in the month of maturity cotton at 
a higher price than the price at which they have bought it or intend to 
buy it. It is this third class of speculators who ultimately set the 
current quotations for cotton futures. 

These current futures quotations, however, have an upper limit which 
may now be described. It costs between two and three cents to carry 
a pound of cotton for a year, depending upon its price level, and a 
lesser amount for shorter periods. Therefore, an eleven- or twelve- 
months future cannot sell for more than three cents above the current 
spot quotation, because at this point a dealer can buy spot, store it and 
deliver a year later on his future sale, thus protecting his position. At 
















this upper limit speculation in futures ceases, and straddling takes its 
place, the speculator thus becoming a straddler, as mentioned above.¹
It may also be said that the ultimate effect of reaching this upper limit 
in futures prices would be to draw the current spot price up toward the 
spot price for the delivery month, since the speculator-straddlers’ 
purchases for storage would increase the demand for immediate 
cotton.²

On the other hand, the current quotations for futures have no lower 
limit. If the current price of cotton is 30 cents a pound, and if a cotton 
speculator chooses to contract to deliver cotton a year hence for 20 cents 
a pound, believing that at the time of maturity or at some previous 
time he can purchase that cotton at 19 cents a pound, there is no 
obstacle placed in the path of his operation by the current market situa- 
tion * (except, of course, the fact that quotations may not fluctuate more 
than two cents in any one day). 
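
The asymmetry between the upper and the lower limit can be put as a simple test. The following is a sketch only: prices are in points (hundredths of a cent per pound), the carrying charge of three cents a year is the upper figure mentioned in the text, the quotations are hypothetical, and the function names are not from the article.

```python
def carry_ceiling(spot: int, carry: int) -> int:
    """Highest futures price that gives no profit to buying spot and storing it."""
    return spot + carry


def store_and_deliver_profit(spot: int, future: int, carry: int) -> int:
    """Profit per pound from buying spot, storing it, and delivering on a future sold now."""
    return future - spot - carry


SPOT = 3000   # 30 cents a pound, as in the text's example
CARRY = 300   # 3 cents for a year (upper figure in the text)

for future in (3350, 3300, 2000):   # hypothetical twelve-month quotations
    profit = store_and_deliver_profit(SPOT, future, CARRY)
    status = "above ceiling: storing pays" if profit > 0 else "no storing profit"
    print(future, carry_ceiling(SPOT, CARRY), profit, status)
```

A quotation of 20 cents against spot at 30 cents gives no opening for the storing operation, and nothing comparable bounds it from below, which is the point of the paragraph.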

The question may now be asked: With what success do professional 
and amateur cotton speculators forecast the price of cotton for the 
month in which they undertake to deliver or to receive cotton? Every 
speculator exercises his option to be either a bull or a bear, and if the 
result of their individual pressures upon the market is an accurate 
composite forecast of cotton prices, which is reflected in the current 
price of futures, we should expect to find the current price of futures 
approximately the spot price of the delivery month. 

The maximum usance of American cotton futures contracts is fixed 
at one year. Hence we find, in any month, contracts being bought and 
sold for delivery of certain quantities of middling cotton for every 
month in the next twelve. Certain months are by custom more used 
than others. Thus March, May and October are active months, while 
few transactions call for delivery in February or September. Some 
futures, however, are sold for all months, and the prices of the inactive
months are kept in line with those of active months by the operations 
of arbitragers. 

It would be possible, of course, to consider all futures in evaluating 
their accuracy in forecasting spot cotton prices, but for practical pur- 
poses samples must suffice. There have been chosen, therefore, three 
series of futures for analysis in this study. The first is a running series 

1 It will be noticed, by referring to Chart One that the price of eleven-months futures never rises more 
than a cent or so above the spot price of the month in which the future is sold. 

2 It is interesting to realize that the Hebrew Joseph’s purchases of corn for Pharaoh’s account during
the seven fat years in ancient Egypt must have done much to increase the price during the time of plenty
and to diminish the increase in price which came during the seven lean years which followed, when the
stored corn was sold.


* Chart One shows that the price of eleven-months futures often falls far below the spot price of the 
month in which the future was sold. 








CHART ONE
COTTON SPOTS AND FUTURES
(Prices in cents per pound)

[Chart. Curves shown: spots; 11-month futures, plotted as of month of maturity; 11-month futures, plotted as of month in which sold.]







































































































of eleven-months futures to show the accuracy with which the long- 
distance forecasts are made. The second and third series comprise 
March futures sold in November and May futures sold in January, in 
order to ascertain the exactitude with which cotton prices are forecasted 
for short periods in the same crop years over a period when supply 
variables enter the price problem in lesser degree. 

In considering the first series of eleven-month futures, reference 
should be made to Chart One, in which three curves appear. The 
solid line represents spot prices in New York; the dotted line represents 
eleven-months futures prices, plotted as of the month sold;¹ while the
broken line represents the eleven-months futures plotted as of the
month of maturity.²

Comparison of spot prices (solid line) with eleven-months futures 
plotted for the month of maturity (broken line), then, will indicate the 
exactness with which spot cotton prices are forecasted eleven months 
previously, while comparison of the spot price curve with the curve of 
eleven-months futures plotted as of the month sold (dotted line) will 
indicate the relations existing between the prices of spots and long 
futures sold concurrently. The conclusions below are based upon an 
analysis of the three curves over a period of 37 years. 

In the first place, it is noticeable that, in their long swings upward 
and downward, forecasted prices never rise as high as actual prices, nor 
descend as low. This would seem to be additional evidence of the truth
of the well-known generalization that futures speculation is an influence 
tending toward an equilibrium of prices and steadier trading, seeking 
always to establish a mean price and bring back the fluctuating spot 
curve to a norm. 

In the second place, it is evident that the price at which eleven-month 
futures are sold is much more closely related to the spot price in the 
month of sale than it is to the spot price in the month of delivery. 
Throughout its course, the futures curve, plotted as of the month sold 
(dotted line), follows the concurrent spots curve in all its fluctuations 
with consistent fidelity. The positive correlation is extremely high. 
Nor is this high correlation due to the fact that the futures price reaches 
the upper limit, relative to concurrent spots, which has been mentioned 
above,* because this upper limit is seldom reached, the futures price 
falling below the spot price of the month of sale in 75 per cent of the 437 
months covered in the chart. The futures reveal a strong tendency to 


1 That is, for instance, in January, 1923, cotton deliverable in December was bought and sold at an 
average price of 21 cents. 

2 For instance, the figure 21 cents, mentioned in the preceding note, appears on Chart One in the
broken line as of the month of December, 1923.
* See p. 413. This limit is set by the possibility of buying cotton and storing it for delivery later.













lag behind the spot price of the month of sale by about the usance of the 
future.¹

In the third place, it may be concluded that there is a certain fore- 
casting faculty in the futures curve because, as said above, when spots 
are very high the long futures are correctly sold for less, and vice versa 
when spots are very low (subject to the limit mentioned). But the 
degree of accuracy of the forecast is debatable. Eleven-month fore- 
casts of spot cotton prices appear for 443 months, and the results 
follow: 


Per cent of error *          Number       Per cent of
                             of cases     total cases

Error less than 5               65            15
Error 5-10                      69            16
Error 10-15                     65            15
Error 15-20                     55            12
Error over 20                  189            42

Total forecasts                443           100















CHART TWO
COTTON SPOTS AND FUTURES

[Chart. Columns for March futures sold in November and May futures sold in January; base line represents spot price at maturity.]

1 Compare the solid and broken lines, with the locus of the solid line preceding by eleven months
in Chart One.
















That is, over the 37-year period, two-fifths of the forecasts contained an 
error of over 20 per cent of the spot price, while in only 15 per cent of 
the cases was the error less than 5 per cent. The average error is be- 
tween 15 and 20 per cent. 
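
The classification used in the table can be sketched as follows. The error is taken as a percentage of the spot price at maturity, as the text states; the treatment of errors falling exactly on a class boundary is an assumption, and the function names are hypothetical. The single worked forecast uses the March, 1911, figures given in a later footnote of this article (a short-period future, shown here only to illustrate the calculation).

```python
def percent_error(forecast: float, spot_at_maturity: float) -> float:
    """Error of the futures price as a per cent of the spot price at maturity."""
    return abs(forecast - spot_at_maturity) / spot_at_maturity * 100.0


def error_class(pct: float) -> str:
    """Class of the table; boundary values are assigned to the higher class (assumption)."""
    if pct < 5:
        return "less than 5"
    if pct < 10:
        return "5-10"
    if pct < 15:
        return "10-15"
    if pct < 20:
        return "15-20"
    return "over 20"


# March, 1911: future offered at 14.6 cents, spot at maturity 14.5 cents.
err = percent_error(14.6, 14.5)
print(round(err, 2), error_class(err))

# Counts of eleven-month forecasts from the table above.
counts = {"less than 5": 65, "5-10": 69, "10-15": 65, "15-20": 55, "over 20": 189}
total = sum(counts.values())
shares = {name: 100.0 * n / total for name, n in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1f} per cent")
```

The unrounded shares differ slightly from the printed whole percentages (for example, about 42.7 against 42 for the last class), which suggests the printed column was adjusted to total 100.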

The short-time cotton futures are a running series of March futures
sold in November and a series of May futures sold in January.
These futures, if any, should accurately forecast spot prices at the time
of maturity, because the world crop is in, the early spinners’ demand has
been ascertained, and the controlling variable for the next several
months should be the late spinners’ demand. It is true that by May
the oncoming crop exercises some influence upon the
spot price.

Chart Two represents a comparison of spot prices in March and May 
with the futures sold in November and January respectively. The 
zero line represents the price of spot cotton in each month, as a base, 
while the hatched and solid columns represent the difference in cents 
between the actual spot price in delivery months and the prices set by 
the futures.¹ There are 74 forecasts. Using the base adopted for the
long forecasts, the results are: 




















Per cent of error            Number       Per cent of
                             of cases     total cases

Error less than 5               20            27
Error 5-10                      13            17
Error 10-15                     15            20
Error 15-20                     16            22
Error over 20                   10            14

Total forecasts                 74           100








Over the 37 years, then, a little over one-fourth of the forecasts were 
accurate within 5 per cent, and the average error is a little over 10 per 
cent. 
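
The two distributions of errors may be set side by side by cumulating the class counts. This sketch uses only the counts printed in the two tables; locating the class that contains the middle forecast is offered as a rough check on the stated averages, on the assumption (not made in the article) that a median class is a fair stand-in for the average. The function names are hypothetical.

```python
CLASSES = ["less than 5", "5-10", "10-15", "15-20", "over 20"]
LONG_FORECASTS = [65, 69, 65, 55, 189]    # eleven-month futures, 443 forecasts
SHORT_FORECASTS = [20, 13, 15, 16, 10]    # March and May futures, 74 forecasts


def cumulative_shares(counts: list[int]) -> list[float]:
    """Per cent of forecasts falling in each class or any lower class."""
    total = sum(counts)
    running, out = 0, []
    for n in counts:
        running += n
        out.append(100.0 * running / total)
    return out


def median_class(counts: list[int]) -> str:
    """First class at which the cumulative share reaches 50 per cent."""
    for name, share in zip(CLASSES, cumulative_shares(counts)):
        if share >= 50.0:
            return name
    return CLASSES[-1]


for label, counts in (("long", LONG_FORECASTS), ("short", SHORT_FORECASTS)):
    shares = ", ".join(f"{s:.1f}" for s in cumulative_shares(counts))
    print(f"{label}: cumulative per cent = {shares}; median class = {median_class(counts)}")
```

The middle forecast falls in the 15-20 class for the long series and in the 10-15 class for the short series, which is in keeping with the averages stated in the text.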

Of interest are the reactions of two critics to whom this study was
presented in manuscript. The second authority had read the criticism
of the first.

In the opinion of the first critic: ‘There is no future price of cotton 
today. All prices are current prices made today on the basis of today’s 
information, and not reflecting tomorrow’s information in any degree 
except as it is foreshadowed by today’s. Prices, rather than being tied 
to future months’ probabilities, are all tied together by the straddlers so 
that any current information which might affect next May’s deliveries 

1 For instance, Chart One shows the spot price in March, 1911, to be 14.5 cents per pound. Chart
Two shows that in November, 1910, March futures were offered at one-tenth of a cent higher, or 14.6
cents per pound, illustrating an accurate forecast.










serves to move the whole current structure. Futures may not sell for
more than the spot price by more than the amount of the carrying charge, nor
for less by more than the amount for which cotton users are willing to defer
their purchases.”

The second criticism: “It is only partially true that ‘all prices are 
current prices made on the basis of today’s information, and not re- 
flecting tomorrow’s information except as foreshadowed by today’s.’ 
I would say that the price of the future is the sum of all knowledge 
today as to what the price will be in the month of delivery. The change 
in the future price, and the difference between the price of futures and 
the price of spots in the month of delivery reflects partially the amount 
by which dealers have to change their minds.” 

It is to be concluded that speculative prices cannot rise above spot 
prices by more than the amount of the carrying charges, and that below 
this point speculation has free range as to cotton prices in the future. 
The past accuracy of these speculations is portrayed in the charts. 



























NOTES 


THE RELATIVE IMPORTANCE OF CHECK AND CASH 
PAYMENTS IN THE UNITED STATES, 1919-1928 


By Arthur F. Burns, Rutgers University


It is a matter of common knowledge that the bulk of payments in 
this country are effected by the use of checks. But what the ratio 
of check to cash payments is can only be guessed. David Kinley 
estimated in 1910 that check payments account for something like 
80 or 85 per cent of the total volume of pecuniary transactions. 
Irving Fisher advanced an estimate of 91 per cent in his Purchasing 
Power of Money. Wesley C. Mitchell, in his recent Business Cycles, 
though he does not venture a definite statement, leans to the figure 
of 85 per cent. In financial literature, estimates of 90 per cent are 
most frequent. 

If the velocity of circulation of money were known, the proportion 
of check to cash payments in the United States would admit of easy 
computation, inasmuch as all other statistical requisites for that pur- 
pose are available in one form or another. The estimates of “total 
debits” by Morris Copeland are satisfactory enough as measures of 
the volume of check payments, if a small deduction is made for with- 
drawals from time deposits and for checks cashed by individuals 
drawing on their own accounts. But there are no figures on the 
volume of cash payments corresponding to the data on check transactions.
To be sure, an excellent series of “money in circulation” is
available, but the total volume of cash payments could be determined 
only if these figures were supplemented by statistics of turnover of 
money. However, though the actual ratio of check to cash transac- 
tions cannot be calculated, the relative changes in that ratio can be 
ascertained roughly. The variations in this ratio are of considerable 
interest, for they are the numeric reflections of our changing habits in 
the making of money payments. 

That payments by checks are of increasing relative importance is 
suggested by the figures of money in circulation (M) and deposits 
subject to check (M’), as given in Table I. The volume of money in 
circulation was practically constant during the decennium 1919-1928, 
while the volume of checking deposits increased almost 50 per cent. 






















TABLE I
SOME INDEXES OF MONEY PAYMENTS¹

          Money in        Deposits        Volume of       Velocity of
Year      circulation     subject         check           circulation
                          to check        payments        of deposits
          (M)             (M’)            (M’V’)          (V’)

1919        100.0           100.0           100.0           100.0
1920        111.4           111.0           108.0            97.3
1921        101.3           103.4            88.3            85.4
1922         92.6           107.8            96.3            89.3
1923         97.7           116.4           102.8            88.3
1924         99.1           123.9           108.1            87.2
1925         98.5           136.8           124.4            90.9
1926         99.5           143.5           132.5            92.3
1927         98.0           145.9           145.4            99.7
1928         96.1           145.9           170.1           116.6





1 The index of money in circulation was established in the following manner: (1) Annual averages
were struck of the “monthly averages of daily figures of money in circulation” published by the Federal
Reserve Board; (2) a deduction was then made of the estimated amounts of money held by (a)
national banks, arrived at by averaging the “call date” figures, (b) “banks other than national,”
established through multiplying the annual averages of “call date” figures for national banks by the
ratio that the money held by “banks other than national” on June 30 bears to the money held by
national banks on that date; (3) the absolute figures were then converted into indexes (more accurately,
relatives). No attempt was made to estimate the amount of money hoarded, destroyed, or in
circulation in foreign countries. In the first few years of the period, the last factor was not unimportant.
(Fifteenth Annual Report of the Federal Reserve Board and Reports of the Comptroller of the Currency.)

The index of deposits subject to check is a conversion into relative form of Carl Snyder's estimates
of this variable. The figures for 1919-1925 were taken from W. C. Mitchell, Business Cycles: The
Problem and Its Setting, p. 126. The figures for the last three years (the estimates for 1927 and 1928
are preliminary only) were furnished by the New York Federal Reserve Bank.

The index of check payments is based on the estimates of “total debits” made by Morris Copeland.
The figures for 1919-1927 are given in this JOURNAL, September, 1928, p. 303. The figure for
the last year was supplied by the New York Federal Reserve Bank. Since indexes only were employed,
it was not feasible to make what would have to be arbitrary deductions for debits to time
accounts and for checks cashed by individuals drawing on their accounts.


When the turnover of money and deposits is taken into account, the 
suggested inference, that the ratio of check to cash payments is a 
rising magnitude, is strengthened. 
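
The last column of Table I can be recovered from the two preceding columns, since the index of deposit velocity is the index of check payments divided by the index of deposits. The sketch below uses the figures of Table I; the variable names are illustrative, and the ratio of deposits to money is added only as a supplementary indication of the movement the text describes.

```python
# Indexes from Table I (1919 = 100): year -> (M, M', M'V', printed V')
TABLE_I = {
    1919: (100.0, 100.0, 100.0, 100.0),
    1920: (111.4, 111.0, 108.0, 97.3),
    1921: (101.3, 103.4, 88.3, 85.4),
    1922: (92.6, 107.8, 96.3, 89.3),
    1923: (97.7, 116.4, 102.8, 88.3),
    1924: (99.1, 123.9, 108.1, 87.2),
    1925: (98.5, 136.8, 124.4, 90.9),
    1926: (99.5, 143.5, 132.5, 92.3),
    1927: (98.0, 145.9, 145.4, 99.7),
    1928: (96.1, 145.9, 170.1, 116.6),
}


def deposit_velocity_index(deposits: float, check_payments: float) -> float:
    """Index of V' implied by the indexes of M' and M'V'."""
    return 100.0 * check_payments / deposits


def deposits_to_money_index(money: float, deposits: float) -> float:
    """Index of the ratio of checking deposits (M') to money in circulation (M)."""
    return 100.0 * deposits / money


for year, (m, m_dep, payments, printed_v) in TABLE_I.items():
    v = deposit_velocity_index(m_dep, payments)
    ratio = deposits_to_money_index(m, m_dep)
    print(f"{year}: V' = {v:.1f} (printed {printed_v}), M'/M = {ratio:.1f}")
```

The computed values of V’ agree with the printed column to the first decimal.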

It may be expected on a priori grounds that the velocity of circula- 
tion of money (V) will show a high degree of “secular stability.” 
The size of cash balances kept by people will vary with the industrial, 
mercantile, and banking arrangements of the community. But in 
any given place, a fairly fixed system of habits with respect to the use 
of money will obtain. Changes in these habits will be very gradual, 
except when some extraordinary factor, such as a large-scale inflation, 
deranges the customary ways. 

General analysis suggests further that the velocity of circulation 
of money is characterized also by a high degree of “cyclical stability,”
and that cyclical changes in the turnover of money are probably 
concordant with, but of lesser amplitude than, the fluctuations in 
deposits activity (V’). Cash is used to a considerable extent in transac- 
tions that are relatively free from wide cyclical oscillation, such as 
purchases at retail, payments of rent, etc. Checks, on the other hand, 
are used in large measure in connection with business activities that 
are marked by extensive cyclical swings. The cyclical variations in
the volume of pecuniary payments are explained only in part by the
movements of M and M’, for the cyclical amplitude of M and M’ is
relatively small. The fluctuations in the turnover of media of payment 
as well as in their volume, then, account for the cycles in the aggregate 
of payments. But it is known empirically that the turnover of M’
undergoes cyclical changes that are, speaking very broadly, similar 
to those in the physical volume of trade. The swings of the latter, 
it is superfluous to add, are extensive. And these several facts—the 
differential character of the transactions in which checks and cash 
are generally employed, the narrow cyclical amplitude of M and M’, 
and the rough congruence of the cycles of deposits activity and trade— 
clearly imply that the amplitude of the fluctuations in the velocity of 
circulation of M is substantially smaller than that of M’.

Under certain conditions, finally, the relative velocities of circulation 
of money and deposits may diverge widely. In a period of intense 
speculative activity, for example, the heightened turnover of deposits— 
in so far as it is occasioned by the speculative factor—will not be ac- 
companied by a rise in the velocity of circulation of money. 

With these several general considerations as a guide, the changes 
in the ratio of check payments (M’V’) to cash payments (MV) may
be computed. Four sets of such computations, based on varying 
assumptions as to the movements of V, are presented in Table II. 


TABLE II
INDEX OF RATIO OF CHECK TO CASH PAYMENTS

          When V is     When V = V’      When amplitude of V    When (3) is corrected
Year      a constant    (in relative     = 1/2 of amplitude     for speculative
                        form)            of V’                  factor
          (1)           (2)              (3)                    (4)

1919      100.0         100.0            100.0                  100.0
1920       96.9          99.6             98.4                   98.4
1921       87.2         102.1             94.1                   94.1
1922      104.0         116.4            109.8                  109.8
1923      105.2         119.1            111.9                  111.9
1924      109.1         125.1            116.6                  116.6
1925      126.3         139.0            132.2                  132.2
1926      133.2         144.3            138.3                  138.3
1927      148.4         148.8            148.1                  154.0
1928      177.0         151.7            162.8                  183.9
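The four columns of Table II are tied together by simple identities, and a rough check of columns 3 and 4 can be sketched from columns 1 and 2 alone. The sketch below rests on two assumptions that are not stated in the note: that the quotient of column 1 by column 2 gives the index of V’ (since column 1 is an index of M’V’/M and column 2 an index of M’/M), and that "one-half as extensive" means half the deviation of the V’ index from its 1919 base. The function names are illustrative only. Because the published figures are rounded, the recomputed values can be expected to agree with the table only to within about one index point.

```python
# Published columns of Table II (index numbers, 1919 = 100).
YEARS = list(range(1919, 1929))
COL1 = [100.0, 96.9, 87.2, 104.0, 105.2, 109.1, 126.3, 133.2, 148.4, 177.0]   # V constant
COL2 = [100.0, 99.6, 102.1, 116.4, 119.1, 125.1, 139.0, 144.3, 148.8, 151.7]  # V = V'
COL3_PUBLISHED = [100.0, 98.4, 94.1, 109.8, 111.9, 116.6, 132.2, 138.3, 148.1, 162.8]
COL4_PUBLISHED = [100.0, 98.4, 94.1, 109.8, 111.9, 116.6, 132.2, 138.3, 154.0, 183.9]


def implied_deposit_velocity(col1, col2):
    """Index of V' (1919 = 1.0): column 1 is M'V'/M, column 2 is M'/M."""
    return [a / b for a, b in zip(col1, col2)]


def damped_velocity(v_prime, share=0.5):
    """ASSUMPTION: V deviates from its base by `share` of the deviation of V'."""
    return [1.0 + share * (v - 1.0) for v in v_prime]


def ratio_index(col1, v_money):
    """Index of M'V'/MV: column 1 divided by the assumed index of V."""
    return [a / v for a, v in zip(col1, v_money)]


v_prime = implied_deposit_velocity(COL1, COL2)
v_half = damped_velocity(v_prime)
col3 = ratio_index(COL1, v_half)

# Column 4: hold the turnover of money in 1927 and 1928 at its 1926 level.
last_normal = YEARS.index(1926)
v_corrected = [v_half[min(i, last_normal)] for i in range(len(YEARS))]
col4 = ratio_index(COL1, v_corrected)

for year, c3, p3, c4, p4 in zip(YEARS, col3, COL3_PUBLISHED, col4, COL4_PUBLISHED):
    print(f"{year}  (3) {c3:6.1f} vs {p3:6.1f}   (4) {c4:6.1f} vs {p4:6.1f}")
```

The printout sets each recomputed figure beside the published one, so that any year in which the two part company is easy to spot.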

Column 1 gives an index of M’V’/MV when V is taken as a constant. In
the second column, we have an index of M’V’/MV when V is assumed to be
equal to V’, in relative form. In the third, the index of M’V’/MV is
determined on the hypothesis that V varies in harmony with V’, but
that variations in V are only one-half as extensive as changes in V’.
The computations for 1927 and 1928 in the second and third series are of 
little significance, for it is unlikely that the stupendous stock market 
activity which probably accounts for the sharp advance of V’ during 
this two-year period exercised any influence on V. To correct for the 
speculative factor, another computation of the relative changes of
M’V’/MV is given in column 4; this differs from the third series only in
that the turnover of money in 1927 and 1928 is taken to be the same as
the turnover in 1926. The series in column 4 is probably the best
approximation of an index of the ratio of check to cash transactions.

The several sets of calculated indexes of M’V’/MV agree substantially
in their broader features. The maximum is reached in every instance
in the terminal year, and each series shows a distinct upward trend.
It is impossible to generalize about the cyclical fluctuations of the
ratio of check to cash payments from data covering so short a period and
expressed in yearly units, but it is noteworthy that the minimum falls
in 1920 or 1921 and that there is a sharp upward movement in 1922,
a year of business “revival.” Excepting the series in column 2, each
index shows a sharp rise in 1927 and 1928. The peculiarities of this
advance have already been commented upon. Needless to say, the
“normal” values for this two-year period are distinctly lower than the
“actual” values. In the case of the fourth series, which accords most
closely with our a priori analysis of the variations of V, the “normal”
value for 1928 cannot be much above 150. In fact, the “normal”
figure for 1928, when extrapolated from a straight-line trend fitted by
“least squares” to the period 1919-1926, is only 146.

If it be assumed that the ratio of check to cash payments was 9 to 1
in 1919, and that the calculated value of the relative change in that
ratio from 1919 to 1928 as given in Table II, column 4, is correct, then
the ratio of check to cash payments was 16.55 to 1 in 1928. Translated
into more familiar terms, this means that check payments accounted
for 94.3 per cent of the total volume of payments in 1928.
If the relative change in the “normal” ratio (rather than in the “actual”
ratio) of check to cash payments is taken, and if this is assumed
to be 50 per cent higher in 1928 than in 1919, then the ratio of check
to cash payments for 1928 becomes 13.5 to 1. Or, using percentages,
the “normal” values for check and cash payments in that year were
93.1 and 6.9, respectively.
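The arithmetic behind the trend figure of 146 and the percentages 94.3 and 93.1 can be retraced in a few lines. The sketch below is illustrative only: it fits an ordinary least-squares line to the column 4 figures for 1919-1926 as they are printed in Table II, extends it to 1928, and converts a ratio of check to cash payments into the share of checks in total payments. The starting ratio of 9 to 1 for 1919 is the note's own assumption; the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def check_share(ratio):
    """Share of check payments in total payments for a check-to-cash ratio."""
    return ratio / (ratio + 1.0)


# Column 4 of Table II, 1919-1926 (the years used for the trend).
years = list(range(1919, 1927))
col4 = [100.0, 98.4, 94.1, 109.8, 111.9, 116.6, 132.2, 138.3]

slope, intercept = fit_line(years, col4)
normal_1928 = slope * 1928 + intercept   # trend value extended to 1928

base_ratio = 9.0                          # assumed 9 to 1 in 1919 (from the text)
actual_ratio = base_ratio * 183.9 / 100   # column 4 figure for 1928
normal_ratio = base_ratio * 1.5           # "normal" ratio taken as 50 per cent higher

print(round(normal_1928))
print(round(actual_ratio, 2), round(100 * check_share(actual_ratio), 1))
print(round(normal_ratio, 1), round(100 * check_share(normal_ratio), 1))
```

The share formula follows from the definition of the ratio: if checks are r times cash, checks make up r/(r + 1) of all payments.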


The computed indexes of M’V’/MV may seem questionable because of
the assumptions employed with respect to V. Light may be thrown
on the reasonableness of the procedure by exploring an alternative
hypothesis. If we postulate that M’V’/MV is a constant, we arrive at an
index of the velocity of circulation of money which in 1926 is 33 and in
1928, 77 per cent greater than in 1919 (the figures in Table II, column 1,
give at once an index of V when M’V’/MV is a constant and an index of
M’V’/MV when V is a constant). This certainly seems implausible in the
light of the horizontal “trend” of V’. Furthermore, over a period of
ten years so large a change in social behavior with respect to the
keeping of cash balances could be occasioned only by some such
unusual circumstance as a sharp rise in the price level, the overhauling of
the banking system, or the like. It appears, then, that at least this
one general conclusion is inescapable: There was a very substantial
advance in the proportion of check payments to cash payments during
the period 1919-1928.
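The double reading of column 1 can be spelled out in symbols. The notation below is introduced only for illustration: the subscript 0 denotes the base year 1919, the subscript t any later year, and I_1 the index in column 1 expressed as a relative (1919 = 1).

```latex
% Column 1: index of M'V'/MV when V is held constant (V_t = V_0).
\[
I_1(t) \;=\; \frac{M'_t V'_t / M_t}{M'_0 V'_0 / M_0}
\]
% Alternative hypothesis: the ratio M'V'/MV itself is constant.
\[
\frac{M'_t V'_t}{M_t V_t} \;=\; \frac{M'_0 V'_0}{M_0 V_0}
\quad\Longrightarrow\quad
\frac{V_t}{V_0} \;=\; \frac{M'_t V'_t / M_t}{M'_0 V'_0 / M_0} \;=\; I_1(t)
\]
```

With the column 1 figures of 133.2 for 1926 and 177.0 for 1928, this gives the increases of 33 and 77 per cent in V cited in the text.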













THE NEW TREND IN DISTRIBUTION 


One of the most successful dinner meetings of the New York Branch of the 
American Statistical Association was held at the Fraternity Club, 22 East 38 
Street, New York City, on Thursday evening, October 17, 1929. Over 300 were 
present. Mr. Jesse Isidor Straus, president of R. H. Macy and Company, pre- 
sided at the meeting. 

The first speaker of the evening was Mr. Albert H. Morrill, counsel for the 
Kroger Grocery and Baking Company, Cincinnati, Ohio. He spoke on the chain 
store development in the food field, and its probable growth. Mr. Morrill stated 
that 90 per cent of businesses are failures, but that to date no such thing as failure 
has taken place in the chain store field. One of the values of the chain store, he 
pointed out, is that of relieving the credit structure of the country by necessitat- 
ing cash payment for merchandise purchased. This relief to credit has been 
not only in the store’s sales for cash but also in the store’s ability to pay cash for 
its purchases. 

Mr. Morrill stated that, based on the known savings which the chain makes 
for the consumer as compared with the independent store, during one year there 
was a saving to the consumers in the United States by the chain grocery system 
of over $300,000,000. He also claimed that the consumer pays only 87 cents 
for merchandise bought in the chain store for which he would have to pay the 
independent store $1. For example, in the case of Kroger Grocery and Baking 
Company, in the year 1928, the public paid $207,372,551 as compared with 
$243,510,000 it would have paid the independent retailer, if the difference in 
mark-downs is considered.

Mr. Ralph Borsodi, director of the Fairchild Analytical Bureau, spoke on the 
critical phase of mass distribution. One of the chief points that Mr. Borsodi 
brought out in his discussion was that the chain store has no monopoly and that 
the merged or consolidated organizations in the past have been successful largely 
due to monopoly control. This has been particularly true in the previous periods
of consolidations, characterized first by the Standard Oil Company in 1870 and 
second by the United States Steel Corporation in 1897. Both these corporations 
grew to their present magnitude largely due to monopoly control which is lacking 
in the chain store development. 

Mr. Borsodi stated that the mass distributors have not the mines, patents, 
franchises, or trademarks to protect them, as is the case of monopoly in 
the production field. He further contended that scientific management is 
essential and necessary to any successful development in the chain store 
field. 

Mr. W. T. Grant, chairman of the board of W. T. Grant and Company, spoke 
on the economic importance of the general chain store development. He con- 
tended that one of the things the chain store does is to cater to the customer’s 
needs, and that its success has been due to its having sensed the needs of a cer- 
tain group of consumers more accurately than have most distributing agents. He 
pointed out very clearly that the greatest population is in the small income 
bracket. Therefore catering to that element and enabling them to satisfy
more of their desires have resulted in making the chain store an economic
need for the American public. 

Mr. Grant also stated that he is not in favor of factories owning distributing 
branches or of distributors owning factories. 

One of the interesting points he made is that there is still a tremendous waste 
caused by overlapping of effort. Mr. Grant expects, however, that the weaker 
ones will gradually be eliminated, especially those serving no real economic 
purpose, and that the chain store in the future will be given its proper place in 
all our great distributing centers, and will be recognized as a great contributing 
factor to the welfare and prosperity of the country. 

Malcolm P. McNair, Associate Professor of Marketing at Harvard University,
spoke on operating expense and profit ratios in department stores. Professor
McNair stated definitely that the chain store will get a greater portion of 1929
retail sales. He maintained that it will be necessary for the independent depart- 
ment store to reduce its ratio of operating expense if it is to gain supremacy in 
the distribution field. The various studies made by Harvard show that the ex- 
pense ratio has been gaining steadily with the result that there has been a steady 
decline in net profits. Specialty stores, however, show a higher ratio of net prof- 
its. The specialty stores doing a volume of over a million dollar business are 
much better than those under the million dollar mark. 

The most interesting feature of Professor McNair’s paper was the conclusion 
drawn from the vast data compiled by the Harvard Bureau, which is as follows: 

1. With few outstanding exceptions, the general department store has reached 
the peak of its development. 

2. The departmentized specialty store may have greater possibilities for de- 
velopment. 

3. The primary factor for the success or failure of department stores rests with 
the management. 

4. The management of department stores at present is probably inferior to 
that of chain stores. 

5. Department store supremacy may be restored by reducing its ratio of 
operating expense to a point where it will be unnecessary to tax the 
consumer. 

The last speaker of the evening was Mr. Philip LeBoutillier, president of
Best and Company. He spoke on “Some Modern Retail Trends.”

Mr. LeBoutillier contended that the greatest handicap of the individual retailer 
until recently has been the ownership management. He stated that the best 
results could be expected with the management a substantial owner but with the 
bulk of responsibility upon a large general public ownership. He believes that 
the study of distribution is a hopeful sign and that the individual realizes the 
need of adjusting conditions to meet the changing tide. 

Mr. LeBoutillier indicated that there still exist thousands upon thousands 
of small independent stores despite the great chain store development. 

Dr. Paul W. Nystrom, Professor of Marketing at Columbia University, in his 
discussion, pointed out that the independent retailer is still a controlling factor 
and will continue to be so in the future. He maintained that the small
independent retailer is able to exist by virtue of the fact that his whole family forms
a component part of the organization. 

Mr. A. W. Zelomek, statistician of the Fairchild Analytical Bureau, pointed 
out that, despite the sharp increase in sales of various chains, the independent 
department store has more than kept pace if a comparison is made on the basis of 
sales per store. 

Mr. Zelomek stated that, with the exception of the variety chains such as 
Woolworth’s and Kresge’s, the ratio of growth for department store sales has 
been higher than that for any other retail group. Following the department 
store growth is the J. C. Penney chain, itself a department store organization. 
During the period studied, Mr. Zelomek added, grocery chains showed only a 
fractional gain, while shoe chains showed an actual decline. 

Mr. Zelomek gave a comparison of the ratios of net profits to net sales during 
1927-1928. These were based on the companies listed on recognized exchanges. 
They show that the specialty store chain has maintained the highest ratio of net 
profits to net sales, with the mail order chains following, and then the general 
chains. The chain department stores showed the smallest ratio of net profits to net
sales, i.e., 3.7 as compared with 6.3 for the independent department stores;
7.0 for the independent specialty stores; 7.6 for the general chains; 7.8 for the 
mail order chains, and 7.9 for the specialty store chains. 

Mr. Zelomek also pointed out that the ratio of inventory to assets for a group 
of chain department stores and mail order chains was considerably below the 
average. He maintained that considering the importance of mark-downs and 
the increasing importance of the style element, the high ratio of inventory to 
assets for the two forms of chain distributors (chain department stores and mail 
order chains) must be regarded as constituting a serious problem, particularly 
since it apparently tends to be aggravated by the addition of new units. 

A. W. ZELOMEK 


PROGRESS OF WORK IN THE CENSUS BUREAU 


THE CENSUS QUESTIONS 


In connection with the preparations for the coming census a great deal of 
study has been given to the formulation of the schedules, involving the selection 
of the items or questions that are to be included. In previous censuses the items 
have been specified in the census act. But the present law, while it specifies the 
main subjects to be covered by the census, leaves the question of detail to the 
discretion of the Director and the Secretary of Commerce under the provision 
that “the number, form, and subdivision of the inquiries in the schedule shall be 
determined by the Director of the Census with the approval of the Secretary of 
Commerce.” This is doubtless a wise change in the law, because the matter of 
adding a new question to the schedule or eliminating a question included in 
previous censuses is one which requires the judgment of experts and must be
considered from several angles. There is danger of overloading the census with
detail, so that as regards any proposed question one must consider not only the
value of the information in itself but its relative value as compared with other
information which might be obtained by some alternative question or questions.
One must consider also the probable degree of accuracy in the answers to the 
proposed question, and the practicability of obtaining the desired information 
through the agency of a census. It must be remembered that the enumerator 
does not by any means see or interview every inhabitant. He must get the data 
for all members of the family from the member or members who happen to be at 
home at the time of his call. The adult male members of the family are quite 
likely to be absent at their places of employment. The questions should be 
such as will be readily, accurately, and willingly answered by whomever the 
enumerator may find at home. That limitation cuts out many topics of inquiry 
which would be of great value if only we could count on getting the correct 
answers without delay or embarrassment. 

On this important matter of selecting and limiting the questions on the sched- 
ule the Bureau is trying to get the best counsel available. Advisory conferences 
have been called by the Secretary of Commerce to consider the scope and formula- 
tion of the schedules—one conference on population, another, with a different 
personnel, on manufacture, a third on distribution, and a fourth on unemploy- 
ment. In formulating the agricultural schedule, the Bureau has been in constant 
conference with representatives of the Department of Agriculture. 


THE POPULATION SCHEDULE 


The population schedule to be used in 1930 will in the main be the same as 
that used in 1920 and earlier censuses. Most of the questions are standard 
questions which are basic and essential in any census of population, such as age 
and sex. No one would think of omitting them; and conservatism in making 
changes is advisable because much of the value of the information obtained in 
each decennial census depends upon its comparability with information obtained 
in previous censuses. Nevertheless in this, as in every previous census, there 
will be some modifications and innovations to meet changing conditions and new 
interests. 

Each census since and including that of 1850 has asked the nativity (state or 
country of birth) of each person enumerated; and in 1880 there were added 
questions calling for the nativity of each parent (father and mother). These 
questions will be retained on the census schedule for 1930. The inquiry as to 
the mother tongue or native language, which was introduced in the census of 
1910 and repeated in 1920, will be retained for the foreign born though not for 
foreign-born parents. 

It is proposed to omit also the question asking for the year of naturalization, 
which of course applied only to naturalized citizens. The question whether 
naturalized will be retained and the year of immigration will as heretofore be 
reported for the foreign born. 

The two questions, whether able to read and whether able to write will be 
consolidated into a single question, whether able to read and write. 

For the information of the Veterans’ Bureau two new questions have been
included to ascertain whether the person enumerated is a war veteran and if so in
what war he served. 

The schedule will include the question ‘‘age at marriage,’’ which was not in- 
cluded in 1920. It was included in 1910 and in 1900 but was never tabulated. 
It is believed that the answer to this question if tabulated in connection with 
number of children reported in the family will afford a reliable index of the rela- 
tive fertility of various population groups as represented by women who have 
been married not more than 10 or 15 years. 

Another new question on the population schedule is the one relating to
unemployment, asking regarding every person usually engaged in a gainful
occupation whether actually at work at the time of the census. If the answer is “No,”
then, as explained in the September issue of this JOURNAL, a supplementary
schedule must be filled out carrying a dozen or more questions designed to de- 
velop the reasons for not being at work. 





THE AGRICULTURAL SCHEDULE 


Of all the schedules used in the decennial census the one for agriculture is the 
most elaborate. One reason for this is found in the variety of crops grown on 
farms in the United States. Information as to acreage and the quantity pro- 
duced is sought regarding all the major crops and most of the minor ones. The 
various kinds of domestic animals on farms (cows, pigs, horses, etc.) must be
separately reported and in more or less detail by age and sex. Other subjects 
covered in the schedule include the acreage of the entire farm and of certain 
classes of farm land such as crop land and pasture land; the tenure under which 
the farm is operated; the total value of the farm with separate items for buildings, 
for dwelling houses, and for implements and machinery; farm debt; principal 
farm expenditures; land drained; land terraced; farm machinery and facilities 
such as automobiles, tractors, telephone, radio, etc., each separately reported; 
and other items. 

The proposed schedule for agriculture includes at present about 360 inquiries, of 
which over 50 are new inquiries not carried in the census of 1920 although some 
of them were in the special agricultural census of 1925. On the other hand, over 
100 questions carried in the census of 1920 will be omitted at this coming census. 

The following are some of the more important inquiries to be made for the first 
time at the census of 1930: Value of the farmer’s dwelling house; such farm ex- 
penditures as purchase of and supplies and repairs for automotive vehicles, and 
electric current; number of combine harvesters, electric motors, and gas engines 
on the farm; the daily production of milk and eggs at the time of the census; 
the number of baby chicks bought; and the number of hides and skins sold from 
the farm. 

A large part of the 100 questions carried on the schedule for 1920 but omitted 
from that for 1930 pertained to values of livestock and quantities of products 
sold or to be sold. The livestock values for 1930 will be computed from average 
price figures, by counties, to be furnished by the Department of Agriculture; and 
the quantities sold or to be sold will be reported for only a limited number of 
products. 







































































In addition to the main agricultural schedule there will be a separate schedule 
for the enumeration of livestock, chickens, and bees not on farms or ranges; an- 
other for special fruits and nuts; and a third for irrigated crops. 


THE MANUFACTURES SCHEDULE 


The census of manufactures to be taken next year will cover the operations of 
the year 1929 and may be looked upon both as a part of the general decennial 
census and as a continuation of the biennial census. The last previous biennial 
census covered the year 1927 and the last decennial census covered, of course, 
1919. 

The scope of the coming census of manufactures will be considerably expanded 
over that of the biennial census of 1927 and will include several items not covered 
by the census of 1919. Under the subject of “Time in operation and hours of 
labor’ questions have been added asking for the number of hours of plant opera- 
tion per day and per week, and the number of shifts per day; also the number of 
days per week for the individual wage earners with a statement of the basis, 
namely, whether a six day, five and one-half day, or five day week, or some other 
basis. 

Each of the principal classes of employees as reported for December 14 or 
nearest representative day is to be given by sex, which has not been done since 
1919. The total amount paid in salaries to the principal officers of corporations is 
to be distinguished from that paid to subordinate officials and clerical employees. 

The value of products is to be the net selling value at the plant of all products 
actually shipped or delivered to customers during the year. In previous censuses
it has been the selling value at the factory or works of all products manufactured
during the year whether sold or not. The sales are to be classified
according to class of purchaser, distinguishing sales to manufacturers, sales to 
wholesalers, sales to retailers, etc.—eight classes in all. This information is 
desired in connection with the census of distribution to complete or supplement 
the information obtained from the schedules filled out for merchants and dealers. 

Under the subject of power equipment there is to be a separate item showing 
the total horsepower of prime movers not ordinarily active and the total capacity 
of generators not ordinarily active. Heretofore these have been included respec- 
tively in the total horsepower and the total generator capacity but have not 
been shown separately. 

Under “fuel” both the quantity and the cost of each kind of fuel are to be re- 
ported. The schedule for 1927 reported separately the quantity of anthracite 
and bituminous coal but did not call for any further detail in regard to fuel. The 
schedule for 1919 called for the quantity only of each kind of fuel. 

The schedule will contain a series of questions asking in regard to each manu- 
facturing plant (1) whether it began operations after January 1, 1928, (2) whether 
it has changed its name, or location, or ownership or general nature of business
since January 1, 1928, and if so (3) what was the former name, or location, or
ownership or general nature of business. On the basis of the answers to these 
questions it will be possible to compile statistics regarding the migration of in- 
dustry, which will be an entirely new subject in the census of manufactures. 










THE DISTRIBUTION SCHEDULES 


As this will be the first census of distribution the questions are all new. In 
general this census will cover all establishments engaged in trade, wholesale or 
retail, and the subjects of inquiry will include a description of the business naming 
the classes of goods dealt in; the number of proprietors or firm members and the
number of employees, distinguishing part time from full time; the aggregate 
amount paid in salaries or wages to each of these two classes of employees; total 
expenses other than salaries and wages; total sales, distinguishing cash from 
credit sales; farm products bought or taken in from farmers; and some other 
topics. So far as practicable sales will be shown separately for the principal 
classes of commodities. In the case of wholesale trade the classification of
commodities by broad classes will probably be fairly complete.
J. A. H. 


MISCELLANEOUS NOTES 


Association Activities in Pittsburgh.—A meeting of the group was held on October 
24 as a farewell to Professor John H. Cover, former secretary for the area. Dr. 
Cover, the speaker of the evening, discussed “The Value to the Business Community 
of Business Research in the University.” The meeting was splendidly attended and
the paper itself very well received. 

In January Dr. Seymour L. Andrew, of the American Telephone and Telegraph 
Company, will address the Annual Meeting of the group. It is expected that 
President Edwin B. Wilson will preside and that Secretary Willford I. King will be 
present. 


The Los Angeles Chapter—A meeting of the Los Angeles Chapter was held 
Wednesday evening, October 2, at the University Club. Mr. Jack Lasdyck, statisti- 
cian of the Los Angeles Stock Exchange, was the speaker and he discussed the statisti- 
cal work and methods of the Los Angeles Stock Exchange, presenting a number of 
exhibits in explanation of this work. Fifteen members were present. 


United States Bureau of Labor Statistics.—The labor turnover figures now being 
published monthly by the Bureau are growing in volume. There will, however, have 
to be a much greater increase in the number of establishments reporting before these 
figures will have their maximum value. 

The annual collection of data on union wage rates and hours of labor has been com- 
pleted and the bulletin giving complete figures is in course of preparation. Summary 
articles have appeared in the Monthly Labor Review. The scope of the union wage 
work is being expanded. Reports are being obtained in several cities for all organized 
trades having effective scales in those cities. Such trades as are not covered in the 
regular report of the Bureau will be represented in a separate tabulation. 

The tabulation of figures relating to wages and hours of labor in the iron and steel 
industry is nearly complete. Summaries of the data for different departments of the 
industry have been published in the Labor Review. 

The bulletin giving wages in the bituminous coal-mining industry is in course of 
preparation. Summary figures have appeared in the Labor Review. 

Field work on the collection of wages and hours of labor in foundries and machine
shops and in the oil well and pipeline industries has been completed and the data are
now being tabulated. 

A large fund of information has been collected concerning the productive efficiency 
of labor in longshore work and its tabulation is in progress. 

Wage rates paid by the fire and police departments of cities of over 25,000 popula- 
tion have been collected and tabulated. 

Agents of the Bureau are at present in the field collecting wage data for the furni- 
ture, airplane, and cement manufacturing industries. 

The Bureau has recently completed a study of homes for the aged in the 
United States, the results of which are embodied in Bulletin 489 now in press. This 
bulletin also brings together data on other forms of care of the aged, collected in pre- 
vious studies. A directory of homes for the aged will be published as Bulletin 505. 

The semiannual cost-of-living survey will be started December 1.


United States Children’s Bureau.—Annual figures on employment certificates 
issued to children 14 and 15 years and 16 and 17 years of age in occupations for which 
certificates are required by state law have been summarized for 16 states, the District 
of Columbia and 51 cities with a population of 50,000 or more in 15 other states. The 
summary shows that 102,934 children of 14 and 15 years and 47,335 children 
of 16 and 17 years received regular certificates for the first time in 1928. A brief 
summary of the facts regarding industry and occupation entered, school grade completed,
and evidence of age accepted for certification is shown in the Annual Report
of the Chief of the Bureau. A full report will be issued in bulletin form. 

The number of juvenile courts coöperating with the Children’s Bureau in the plan
for uniform recording of juvenile court statistics increased from 43 in 1927 to 65 in
1928. The tabulation prepared in the Bureau shows 38,882 delinquency cases, 16,289
dependency and neglect cases, and 10,429 cases of children brought into court for
discharge from probation or supervision, in the 65 courts coöperating during the year.
General facts with regard to these cases appear in the annual report of the Chief of the 
Bureau. The full report, now in press, shows such facts as reason for reference to 
court, place of care pending hearing, and disposition of case as related to sex, color, 
age and nationality of child. 

Reports recently issued by the Bureau are: Bulletin 192, Child Labor in New Jersey, 
Part 1, Employment of School Children, and Publication 194, The Promotion of the 
Welfare and Hygiene of Maternity and Infancy for the year ended June 30, 1928. 


Payroll Statistics of the New York City Government.—With the thought of promot- 
ing interest in an index of public employment, the Committee on Governmental La- 
bor Statistics of the American Statistical Association has recently begun a study of the 
payroll statistics of the New York City government. 

Although progress has been made in the measurement of employment afforded by 
private employers, there has been no attempt to measure governmental employment, 
so far as can be learned. As the United States has been the pioneer in developing 
payroll statistics for industry at large, it is fitting that the first experiments in evolving 
indexes of public employment should be made in this country. Moreover, the sub- 
ject is of present concern because of President Hoover’s indorsement of the principle 
that public authorities should expand and contract their employment so as to com- 
pensate for dullness or high activity in private industry. 

Last winter, when the Welfare Council of New York City was trying to arouse interest
in the increased unemployment, the Committee suggested the need of some method
of measuring employment afforded by the city itself. A committee on unemployment
which had been established by the Welfare Council later submitted to Mayor Walker 
a report which included a recommendation on this subject. 

With this support, the Committee on Governmental Labor Statistics has been able 
to begin work on an index of the numbers and earnings of the employees of the New 
York City government. It was discovered that the records did not indicate the 
numbers of employees covered by the various payrolls. At the Committee’s request, 
the city government promptly issued orders to the various departments that these 
data should be reported. 

The work on the index has necessitated examination of some 2,000 payrolls. The 
procedure is now fairly well formulated, and it seems probable that in the course of the 
next few months a regular system of reporting the employment and earnings of the 
employees of the New York City government will be instituted. 

The Committee is now addressing itself to the work of recording employment 
afforded by the city through contractors, a task involving many difficulties. 


Research Associates of the National Bureau of Economic Research.—The Directors 
of the National Bureau of Economic Research will appoint three Research Associates 
for the academic year 1930-31. The purpose of these appointments is to provide 
mature workers with facilities for the conduct of quantitative research in economics. 
They are not intended to aid persons working for higher degrees. 

Each Research Associate will receive a stipend of $3,600 per year, plus the expenses 
of the round trip between his home and New York. Research Associates will be in 
residence at New York during eleven months of the year beginning September 15, 
1930. 

It is desirable that candidates for appointment as Research Associates have definite 
research projects under way at the date of application, and that these projects should 
have reached such a stage that completion within a period of one year may ordinarily 
be expected. Research projects proposed by candidates may fall in fields now culti- 
vated by members of the staff of the National Bureau, or may relate to subjects not 
hitherto covered in the work of the National Bureau. It is assumed, however, that 
the work of Research Associates will deal primarily with the quantitative aspects of 
economic problems. 

Publication rights to the results of studies conducted by Research Associates will 
be reserved to the National Bureau of Economic Research. 

Appointment of Research Associates will be made upon recommendation of a 
Committee of six, five to be chosen from the group of university representatives on 
the Board of Directors of the National Bureau, the sixth to be appointed by the Social 
Science Research Council. 

Applications for appointment should be submitted to the Directors of the National 
Bureau of Economic Research, 51 Madison Avenue, New York City, not later than 
February 1, 1930. Those interested should communicate with the Directors for 
further information. 


New Statistics Section of the National Safety Council—On Tuesday, October 1, 
1929, at the eighteenth annual meeting of the National Safety Council, Chicago, the 
newly constituted statistics section of the Council conducted its first regular session. 
Dr. Louis I. Dublin is the General Chairman of the Section and Mr. R. L. Forney is 
Secretary. The morning meeting, under the chairmanship of Mr. Leon Aronowitz 
(New York State Bureau of Motor Vehicles), included papers by Mr. W. W. Matthews,
Miss Ethel Usher and Mr. W. C. Brent on practical problems in the organization
of state and city statistical services. The afternoon meeting, under the chairmanship
of Dr. Dublin, was a unit session on “Yardsticks of Safety.” Papers were
presented by E. W. Kopf on the classification of accidents in general; W. Thurber 
Fales on non-traffic accidents; Fred Rosseland on motor vehicle accidents; E. S.
Fallow on home accidents; David Beyer on industrial accidents; and Lewis S. DeBlois 
on criteria for the increase or decline in industrial accidents. 

In 1929, the Section’s Committee on Supplementary Reports of Accidental Causes 
of Death, E. W. Kopf, Chairman, conducted inquiries with two committees of the 
American Public Health Association in respect to the revision of the accident titles in 
the International List of Causes of Death. The revised titles will be submitted to the 
International Commission for the Revision of the International List of Causes of 
Death, Paris, October, 1929. 

Membership in the new section is open to any person employed in the preparation 
and publication of accident statistics. Mr. R. L. Forney, 108 East Ohio Street, 
Chicago, Illinois, or Mr. W. W. Matthews, State Bureau of Motor Vehicles, Harris- 
burg, Pennsylvania, will be glad to welcome new members. 


The Brookings Institution.—Recent additions to the staff of the Institute of Eco- 
nomics are Frieda Baird, William H. Young, Margaret Schoenfeld, and Knute 
Bjorka. Sumner H. Slichter, of Cornell University, has returned to the Institute on a
year’s leave of absence to complete a study of the Influence of Trade Unions on Production.
The following have resigned from the staff of the Institute of Economics:
Joseph G. Knapp, to accept a position as Associate Agricultural Economist at North
Carolina State College; Jurgen Kuczynski, to edit Finanzpolitische Korrespondenz;
Frank Tannenbaum, to complete his study of the life of Thomas Mott Osborne under 
a grant from the Social Science Research Council. Isador Lubin is spending several 
months in Europe making a study of the radio industry. Leverett S. Lyon has re- 
turned from Amsterdam, where he acted as a delegate for the United States Govern- 
ment at the International Congress on Commercial Education. Lewis L. Lorwin 
is in attendance at the Conference of the Institute of Pacific Relations in Kyoto. 

Henry P. Seidemann and Taylor G. Addison, of the Institute for Government Re- 
search, are in Santo Domingo supervising the establishment of the new system of 
financial administration recommended by the recent Dawes Commission. Laurence 
Schmeckebier has been appointed supervisor of the Census of Indian Reservations for 
the Fifteenth Decennial Census, 1930. 

The newly organized Publications Division will publish and distribute all of the 
books and pamphlets of the Institution. Recent publications of the Institution 
include: The St. Lawrence Navigation and Power Project, by Moulton, Morgan, and 
Lee; The Tariff on Iron and Steel, by Berglund and Wright; Unemployment Insurance 
in Germany, by Carroll; and The Principles of Judicial Administration, by Willoughby. 


PERSONAL NOTES 


A. M. Carr-Saunders of Liverpool, author of The Population Problem, and Professor 
Corrado Gini, President of the Central Institute of Statistics in Rome, are to be in 
residence during the spring quarter, 1930, at the University of Minnesota, as visiting 
professors of Sociology. 


Mr. John C. Clendenin, Director of Research at the Los Angeles Stock Exchange, 
is giving a course in statistical methods in the Educational Institute of that Exchange. 







Although the Stock Exchange Institute has just been started, over 1,400 students 
have been enrolled in the various courses, according to Dr. Gordon S. Watkins, 
Director. The courses offered include Brokerage Practice, Investments, Brokerage 
Accounting and Corporation Finance. 


Mr. Alan E. Treloar, formerly Farrer Memorial Scholar of Sydney, Australia, and 
for the past two years International Education Board Fellow, and Miss Borghild 
Gunstad, formerly Assistant in the Department of Mathematics, University of 
Minnesota, have been added to the biometric staff of the Department of Botany, 
University of Minnesota. 


MEMBERS ADDED SINCE SEPTEMBER, 1929 


Bennit, Mrs. Helen L., U. S. Bureau of Mines, Washington, D. C. 

Bohannan, Charles D., Graduate School, Columbia University, New York City 

Borgedal, Dr. Paul, Professor of Farm Economics, Landbrukshoiskolen i As, Norway 

Boyer, Samuel A., Commonwealth Securities, Inc., 520 Cuyahoga Building, Cleveland,
Ohio

Breen, William J., 255 West 20 Street, New York City 

Brooding, Milton E., Statistical Department, California Packing Corporation, 101 
California Street, San Francisco, Calif. 

Brumley, Frank W., Department of Agricultural Economics and Farm Management, 
University of Florida, Gainesville, Fla. 

Bryant, H. F., U. S. Department of Agriculture, 522 Custom House, Louisville, Ky. 

Choate, Marjorie S., National Industrial Conference Board, 247 Park Avenue, New 
York City 

Church, Verne H., U. S. Bureau of Agricultural Economics, 730 State Office Building, 
Lansing, Mich. 

Conway, Herman M., National Live Stock Producers Association, 608 South Dearborn
Street, Chicago, Ill.

Cureton, Edward E., Territorial Normal and Training School, Honolulu, Hawaii 

Davis, R. E., Goodyear Tire & Rubber Company, Akron, Ohio 

Forman, H. Irving, c/o Boris Kublanov, 113 West 23 Street, New York City 

Gans, A. R., 124 Linden Avenue, Ithaca, N. Y. 

Hady, Frank T., Department of Farm Economics, State College, Brookings, S. D. 

Heggie, Helen E., Institute for Research in Land Economics and Public Utilities, 
Northwestern University, Chicago, Ill.

Kelley, Pearce C., School of Business Administration, University of California, 
Berkeley, Calif. 

LeVita, Maurice H., Fidelity Mutual Life Insurance Company, Philadelphia, Pa. 

McMichael, Nellie, Department of Labor, Washington, D. C. 

Malkary, Henry J., Douglas W. Clinch Company, Investment Bankers, 72 Wall 
Street, New York City 

Mallory, Walter S., 32 West 40 Street, New York City 

Marks, H. A., Bureau of Agricultural Economics, U. S. Department of Agriculture,
Box 273, Orlando, Fla. 

Meek, Howard B., Cornell University, Ithaca, N. Y. 

Mendel, Anne P., International Acceptance Bank, Inc., 52 Cedar Street, New York 
City 



















































Neubert, Hedwig, Pullman State College, Pullman, Wash. 

Newman, George J., Bureau of Foreign and Domestic Commerce, Washington, D. C. 

Nolloff, Janaki S., Professor of Farm Management, Faculty of Agriculture, Sofia, 
Bulgaria 

Norton, Professor L. J., University of Illinois, Urbana, Ill.

Nystrom, Dr. Paul H., Columbia University, New York City 

Parr, George E., New England Power Association, Room 1308, 89 Broad Street, 
Boston, Mass. 

Parsons, Herman W., Jr., Bamberger Brothers, 39 Broadway, New York City 

Perryman, Herbert A., Los Angeles Railway Corporation, Los Angeles, Calif. 

Powlison, Dr. Keith, Claremont Colleges, Claremont, Calif. 

Sanders, Barkev S., Department of Sociology, Columbia University, New York City 

Sherman, Joseph V., National Newark and Essex Banking Company, Broad and 
Clinton Streets, Newark, N. J. 

Smith, Isabella B., Brookmire Economic Service, Inc., 551 Fifth Avenue, New York 
City 

Solow, Helen E., Lambert and Feasley, Advertising Agency, 17 East 49 Street, 
New York City 

Stebbins, A. M., Pacific Mills, 24 Thomas Street, New York City 

Stoker, Herman M., Cornell University, Ithaca, N. Y. 

Tanchin, William, Y. M. C. A. Hotel, 826 South Wabash Avenue, Chicago, Ill.

Timoshenko, Dr. Vladimir P., School of Business Administration, University of 
Michigan, Ann Arbor, Mich. 

Trafton, George H., 76 North Main Street, Leominster, Mass. 

Troescher, Tassilo, Farm Management Research, Bernburgerstrasse No. 14, Berlin, 
S. W. 11, Germany 

Tyler, Kathryn D., University of Michigan, Ann Arbor, Mich. 

Urich, John E., Tabulating Machine Company, 50 Broad Street, New York City 

Vandensteen, Nicholas P. F., Hyatt Roller Bearing Division of General Motors, 
Harrison, N. J. 

Waugh, Professor Albert E., Connecticut Agricultural College, Storrs, Conn. 

Werber, Herman A., Associated Gas and Electric Corporation, 33 Liberty Street, 
New York City 

Weiss, Bertram E., Columbia University, New York City 

Wilcoxen, Lewis C., City Engineer’s Office, Detroit, Mich. 

Wilharm, Fred C., Homewood Station, Pittsburgh, Pa. 

Wilson, Robert M., Fruit Dispatch Company, 700 Prospect Avenue, Cleveland, 
Ohio 

Woodside, Byron D., Economic Division, Federal Trade Commission, Washington, 
D. C. 

Wright, John W., U. S. Department of Agriculture, 205 Cotton Exchange Building,
El Paso, Tex.














REVIEWS


Interrelationships of Supply and Price, by G. F. Warren and F. A. Pearson. Cornell
University Agricultural Experiment Station, Bulletin 466, March 1928.
144 pp.

This publication includes in its 140 closely-packed pages of text, charts, and
tables, an extraordinary amount of statistical information bearing on the relations
between supply and price of agricultural commodities and, for purposes of
comparison, some similar material for industrial products. Statisticians and
economists will find it an amazingly rich source of such material, far surpassing,
in this respect, any other publication in this field.

The arrangement in the body of the work is by commodities. The treatment
of each commodity follows similar lines, in so far as the data and the character of
the significant relationships permit. The sub-headings under which the first
commodity, potatoes, is treated may be listed as representative. They are:
Effect of supply on prices; effect of deflation on price relationships; increasing
violence of farm-price fluctuations; effect of supply on total value; relation
between farm, wholesale, and retail prices; effect of size of crop on relationships of
farm, wholesale, and retail prices; and effect of prices on acres of potatoes planted.
Additional commodities treated in considerable detail are: Hay, cabbage, corn,
oats, wheat, hogs, beef cattle, and horses. Ten other agricultural commodities
are treated briefly (sweet potatoes, apples, bananas, peaches, pears, cranberries,
barley, buckwheat, rye, and rice), as are also four industrial products (pig iron,
phosphate rock, zinc, and salt).

The text is followed by a text table and an appendix table showing the calculated
relationship between supply and concurrent price for 25 different commodities
treated separately and for six groups of commodities, treated as groups. In
the text table is shown the average price, as a percentage of normal, accompanying
various levels of production, expressed likewise as percentages of normal. In
the appendix table are given the calculated equations of relationship on which
are based the data in the text table. Each table shows, for most of the commodities,
relationships calculated from several different price series and for a few
commodities relationships calculated for more than one production series, bringing
the total number of calculated relationships shown in each of these tables
to 221.

The appendix includes also a list of sources of data employed, a summary of
salient points brought out by other students of supply-price relationships,¹ a
bibliography of publications on supply-price relationships, an argument for
general use to represent supply-price relationships of equations of the form

$$Y = \frac{B}{X^{A}} + C$$

in which C may generally be assumed equal to zero, and a description of the
methods employed in calculating the supply-price relationships shown in the
accompanying text and tables.

1 Relationships presented elsewhere in the study cover only those worked out by the authors and their
colleagues and students at Cornell.
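
As a rough illustration of how an equation of this form turns a level of production into a price, both stated as percentages of normal, the sketch below reads the printed formula as Y = B / X^A + C, with C taken as zero. The exponent symbol A, the function name, and the value 0.9 are assumptions made for illustration only; they are not constants taken from the bulletin.

```python
def price_pct_of_normal(supply_pct, exponent, c=0.0):
    """Price (% of normal) implied by Y = B / X**exponent + c.

    B is chosen so that a normal supply (100) gives a normal price (100).
    """
    b = (100.0 - c) * 100.0 ** exponent
    return b / supply_pct ** exponent + c


if __name__ == "__main__":
    A = 0.9  # hypothetical exponent, not a value from the bulletin
    for supply in (80, 90, 100, 110, 120):
        print(supply, round(price_pct_of_normal(supply, A), 1))
```

With C equal to zero, a curve of this form gives the same percentage response of price to a one per cent change in supply at every level of supply.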

As the foregoing summary suggests, this work presents the appearance of a 
handbook of information on supply-price relationships, suggesting comparison 
with handbooks of engineering formulae. The extreme brevity of the discus- 
sions of relationships stated and the limited space given to general economic 
analysis support this interpretation. Viewed as a handbook of information on 
factors determining prices of commodities, however, the work must be disap- 
pointing to one who would use it for purposes of price forecasting. There is 
virtually no reference to the numerous factors apart from varying supply (and 
its distribution) that determine price, and no effort to put the data into con- 
venient form for forecasting purposes. 

An appraisal of this work as a handbook would, in the opinion of the present 
reviewer, do it great injustice. It is to be judged rather as a presentation of data 
leading to a series of important economic generalizations. To gain a true appre- 
ciation of the significance of the work it must be read from cover to cover. Its 
handbook form is unfortunate, on this interpretation, for it discourages consecu- 
tive reading. 

Nine general conclusions appear to stand out as most important in the minds 
of the authors. Without attempting to indicate the relative significance attached
to each by the authors, these conclusions may be summarized as follows:²

(1) “Elasticities of demand” calculated from relations between supply and 
wholesale prices greatly understate the elasticities of actual consump- 
tion demands and at the same time give an entirely false idea of the 
extent to which farm prices change with changes in production. 

(2) “When there is a large crop, the farm price is reduced more cents per
bushel than is the retail price. It costs more cents per bushel to get the
cheap crop to the consumer than to get the high-priced crop to him”
(p. 144).

(3) “Over a series of years, there has been a general tendency for an in- 
crease in the violence of farm-price fluctuations. A crop that is 10 per 


1 It must be admitted that other reviewers might list the nine principal conclusions somewhat differ- 
ently and the authors might not entirely subscribe to any such list. Conclusions are stated dogmati- 
cally and are to be found on almost every page. The seventeen paragraphs of the closing summary 
state with equal emphasis broad generalizations that rest on a great mass of evidence presented on 
previous pages, conclusions undoubtedly sound, but for which no evidence has been submitted, and mere 
statements of isolated facts. One’s judgment of the authors’ intended emphasis must rest largely on the 
amount of material assembled to support the conclusions and the number of times a statement is re- 
peated. 

It may be added that several other generalizations interested the present reviewer quite as much as 
some of those here listed: for example the statement (p. 27) that “the higher the percentage of the crop
marketed, the more violent are the farm-price fluctuations.” In the absence of further development of
the points, however, such observations could not be classed among the generalizations emphasized by 
the authors. 

2 I have endeavored in so far as possible to give these conclusions in the words of the authors’ statement
in their summary (pp. 143-44). Five of the nine conclusions are quoted from this summary.
Three of the other conclusions are reflected in statements in the summary but less completely or less
concisely so that it seemed better to quote one of several similar statements appearing in the text (No. 3) 
or to state them in my own words. The conclusion I have listed fifth is not hinted at in the summary 
but is strongly emphasized at several points in the text. 













cent above or below normal causes a greater change in price than
formerly” (p. 15). This general tendency toward increasing violence
of farm price fluctuations has been modified by the fact that “when
financial inflation occurs . . . farm price fluctuations are . . . less
than normal . . . when deflation occurs . . . a slight percentage
change in retail prices makes a violent change in farm prices” (p. 13).
(4) “There is no ‘world market’ to which other markets have a constant
relationship” (p. 144).
(5) “Location of supply is extremely important [in its effect] on price”
(p. 29).
(6) “Consumers pay more for a large crop than for a small crop. Farmers
receive less total dollars for a large crop than for a small one” (p. 143).
(7) “The agricultural depression is primarily due to high handling charges
which have resulted from deflation. . . . Retail prices of food in the
United States do not indicate that there is an oversupply or an under-demand”
(p. 143).
(8) In many respects the farmer faces more difficulties than industry in
adjusting production to prices (pp. 102-105).
(9) “Farmers respond to prices as vigorously as does industry, but they are
dealing with biological facts” (p. 144).

The fact that demand curves calculated from statistics of supplies and of 
wholesale prices must understate the elasticity of demand conceived of in terms 
of prices to consumers has been noted by others,¹ but nowhere else has there
appeared such a thorough demonstration of the fact, and especially of the magni- 
tude of the understatement, as is given in the work under review. The authors 
show for example that over the period 1889-90 to 1913-14 an increase in supply 
of hogs such that the number of pounds packed in western markets was 20 per 
cent above normal was accompanied by western market prices 15 per cent below 
normal, but by retail prices of lard, bacon, ham, and pork chops (combined in a 
weighted average) only 5 per cent below normal (p. 72). These facts do not en- 
tirely justify the statement (p. 78) that “in the twenty-five years before the war, 
a 5-per-cent drop in the retail prices of pork led consumers in this country or in 
some other country to dispose of 20 per cent more hogs,”² but in general tenor
the conclusions are undoubtedly sound. Another illustration may be taken from 
the data on potatoes in which it is shown that over the period 1895-1915 a crop 
20 per cent below normal was accompanied by New York retail prices 6 per cent 
above normal, by New York wholesale prices 38 per cent above normal, and by 
farm prices at Batavia, New York, 54 per cent above normal. These figures re- 
flect widely different elasticities of demand. 
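
The potato figures just quoted can be turned into rough ratios of percentage price change to percentage supply change at each stage of distribution. The sketch below does only that arithmetic. The names are illustrative, and the reciprocal it reports is merely a crude stand-in for an elasticity, since a proper figure would rest on the regression of supply (or consumption) on price rather than of price on supply.

```python
# Potato crop 20 per cent below normal, 1895-1915, as quoted in the review.
SUPPLY_CHANGE_PCT = -20.0
PRICE_CHANGE_PCT = {
    "New York retail": 6.0,
    "New York wholesale": 38.0,
    "Batavia farm": 54.0,
}


def price_response(price_change_pct, supply_change_pct):
    """Percentage change in price per one per cent change in supply."""
    return price_change_pct / supply_change_pct


def crude_elasticity(price_change_pct, supply_change_pct):
    """Reciprocal of the price response: a rough point ratio only."""
    return supply_change_pct / price_change_pct


if __name__ == "__main__":
    for stage, dp in PRICE_CHANGE_PCT.items():
        print(stage,
              round(price_response(dp, SUPPLY_CHANGE_PCT), 2),
              round(crude_elasticity(dp, SUPPLY_CHANGE_PCT), 2))
```

On this crude reckoning the ratio of quantity change to price change is roughly nine times as large at retail as at the farm, which is the sense in which the figures reflect widely different elasticities.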

1 See, for example, the present reviewer’s paper on “The Statistical Determination of Demand Curves”
in the Quarterly Journal of Economics, XXXIX, pp. 503-43, August 1925.

2 Such a statement should be based on the regression of supply (or better, of consumption) on price
instead of the regression of price on supply.

Apology is perhaps due the authors for referring to their conclusions in terms of
elasticity of demand, a form of expression they rigorously avoid. They deny
that the “supply-price” curves they derive may properly be called demand
curves, arguing that “a demand curve, of course, must be the relation of consumers’
prices to consumers’ supply.” Economists might do better to use the
term “demand curve” only in this restricted sense and so avoid some of the
temptation of assuming that curves representing the relations between amounts 
consumed and prices at different stages in the distributive process show similar 
elasticities. The fact remains that it is the almost universal custom among 
economists and others when discussing most concrete demand problems (as con- 
trasted with the artificially simplified problems of the introductory portions of 
economic texts) to conceive of demand in terms of quantities consumed in rela- 
tion to wholesale prices. In these circumstances we prefer to restate the con- 
clusion of Warren and Pearson and to say that elasticity of demand for any 
product is usually much greater when expressed in terms of prices to the consumer 
than when expressed in terms of wholesale prices. This is a point of much sig- 
nificance for economic theory. 

The fact that elasticity of demand is greater when measured in terms of prices 
to the consumer than when measured in terms of prices at earlier stages in the 
distributive process rests largely on the simple fact that a 10-cent change in price 
(for example) represents a smaller percentage of a retail price of $2.00 than of a 
wholesale price of $1.50 or of a farm price of $1.00. In part, however, it may 
rest also on the facts next to be discussed. 
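
The arithmetic behind this point is easily checked. The sketch below uses the three illustrative prices named in the text; the function and variable names are assumptions made for the example.

```python
def pct_change(change, price):
    """A price change expressed as a percentage of the starting price."""
    return 100.0 * change / price


PRICES = {"retail": 2.00, "wholesale": 1.50, "farm": 1.00}  # dollars, from the text
CHANGE = 0.10  # a 10-cent change

if __name__ == "__main__":
    for stage, price in PRICES.items():
        print(stage, round(pct_change(CHANGE, price), 2))
    # retail 5.0, wholesale 6.67, farm 10.0
```

The same change in quantity taken, set against a 5 per cent rather than a 10 per cent change in price, yields an elasticity twice as large.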

The conclusion we have listed second, that “it costs more cents per bushel
to get the cheap crop to the consumer than to get the high-priced crop to
him,” is stated more broadly and dogmatically than the evidence seems to
warrant. Though the tendency be much less general than the statement implies,
the evidence is sufficient to demonstrate that it is an important tendency.
It is shown, beyond any reasonable doubt, that a decline in prices of potatoes, at 
wholesale, is not reflected to consumers who purchase at hotels and that a de- 
cline in corn prices is only partly reflected in retail prices of corn meal. Similar 
conclusions relative to spreads between prices at different stages in the marketing 
process for potatoes, cabbage, and wheat are open to question. The conclusions 
of the authors are based on average spreads for years of large and years of small 
crops. The differences for potatoes and cabbage are small and have large prob- 
able errors. When the data (from Table 8 and Table 30) are studied in chart 
form, they lead the present reviewer to the conclusion that the spreads tend to 
narrow rather than to widen in years of large crops and low prices. The data 
cited on wheat prices appear to the present reviewer to reflect changes in spreads 
between different classes and qualities of wheat more than changes in marketing 
charges.¹ In many cases the authors’ conclusions rest on the questionable
assumption that farm prices in Rhode Island and in Georgia may be regarded as 
essentially retail prices. One of the most convincing bodies of data presented in 
this connection really bears on a different, though perhaps no less important, 
point. It is shown that in each year from 1903 to 1914, sales of Nebraska horses 
in Vermont showed a much wider margin between purchase and sale prices on 
horses sold at low prices than on horses sold at high prices (pp. 100-101). A 
similar tendency toward wider margins on lower-priced grades than on high- 


1 This judgment is based on extensive studies of wheat prices at the Food Research Institute, which 
unfortunately cannot be summarized here for lack of space. 










priced grades has been shown for wheat by the studies of the Federal Trade 
Commission.¹

The conclusion that there has been a progressive increase over a period of years
in the violence of farm price fluctuations, for given changes in supply, might probably
be broadened to cover wholesale prices also. It receives its chief statistical
support from data on potato prices and on corn prices. The statistical evidence 
is strongly supported by a critical economic analysis of the factors involved 
(chiefly on pp. 15-18). 

In the conclusion we have listed sixth the authors would be on safer ground if 
the first part of the statement were worded: A large crop is worth more at con- 
sumers’ prices than a small crop. It is not so clear that consumers pay more for 
the potatoes or cabbage or flour that they actually buy in years of large crops 
than in years of small crops. That farmers receive less total dollars for a large 
crop than for a small one is clearly demonstrated as generally true, provided, of 
course, one refers to receipts for at least the major portion of the production that 
is influential in determining the price; the statement is not intended to apply to 
receipts for a large or small crop in an area too restricted to influence greatly the 
price obtained. 
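
The contrast between total value at consumers’ prices and at farm prices can be illustrated with the potato figures quoted earlier in this review: a crop 20 per cent below normal, with New York retail prices 6 per cent and Batavia farm prices 54 per cent above normal. The sketch assumes, purely for illustration, that the whole crop is valued at each price in turn; the function name is hypothetical.

```python
def value_pct_of_normal(crop_pct, price_pct):
    """Total value (% of normal) of a crop valued at a given price level."""
    return crop_pct * price_pct / 100.0


if __name__ == "__main__":
    crop = 80.0  # crop 20 per cent below normal
    print("at retail prices:", round(value_pct_of_normal(crop, 106.0), 1))
    print("at farm prices:", round(value_pct_of_normal(crop, 154.0), 1))
```

On these figures the small crop is worth about 15 per cent less than normal at retail prices but about 23 per cent more than normal at farm prices.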

The important conclusion that “the agricultural depression is primarily due to 
high handling charges which have resulted from deflation” is supported by a 
large volume of data, among which is a most interesting index of the cost of dis- 
tribution of farm products (pp. 96-99). One may question whether the increase 
in handling charges is to be attributed entirely to the deflation, for this conclusion 
carries with it the doctrine that the present high level of wages is to be ascribed 
entirely to the deflation, but it seems clear that the authors are amply justified in 
attributing the agricultural depression in large measure to the increase in handling 
charges, which in turn was at least facilitated by the deflation. The present re-
viewer cannot concur, however, in the opinion that oversupply or under-demand
is not a factor in the present agricultural situation.2 It is difficult to prevent a
discussion of alleged oversupply or under-demand from degenerating into a
controversy over definitions or an argument in which issues are nowhere really
joined. Suffice it to say that, where the authors base their conclusion chiefly on
the ground that retail prices of food are now on a level with the cost of living, as
indicated by generally accepted index numbers, the present reviewer holds that,
since other items in the cost of living have benefited more from increases in
efficiency of production than have the products of agriculture, the authors’
conclusion would be supported only if retail prices of food had risen more than
the general cost of living index; and that agriculture is still suffering from the
curtailment in demand for agricultural products
incident to the extensive replacement of the horse by the automobile and truck
and from some less important changes in demand for foods and textiles.

HOLBROOK WORKING

Food Research Institute 


1 Report on the Grain Trade, Vol. I, pp. 191-96. 
2 The position of the authors is more fully stated in a paper by the senior author in the Journal of
Farm Economics, Vol. X, No. 1, pp. 1-15. He indicates here the opinion that oversupply was a factor
in the agricultural depression in the United States from 1921 to 1924, but not subsequently.































































Trends in Philanthropy, by Willford I. King, assisted by Kate E. Huntley. New 
York: National Bureau of Economic Research, Inc. 1928. 78 pp. 


This book presents the results of an attempt to determine trends and fluctua- 
tions in receipts and expenditures of all philanthropic agencies in New Haven, 
Connecticut, over the quarter century, 1900 to 1925. The study was made at 
the request of the Carnegie Corporation, as a preliminary survey to find how far 
answers could be obtained to questions concerning the extent and development of 
philanthropic giving in the United States. It is the first study in which an en- 
tire community has been subjected to this type of analysis and its findings, al- 
though they relate to one community only, are, therefore, of great interest. 

Philanthropy here includes all religious organizations, hospitals, and other 
health and social welfare agencies, public and private. Education is excluded, 
except as it is part of the activity of some of these agencies. For 1900, 144 organi- 
zations required study and for 1925, 223 organizations. Six field workers were 
engaged four months, and two longer, in transcribing the agencies’ data. It is re- 
markable that they found “reasonably complete records of totals and their 
apportionment” for half of the organizations, representing 78 per cent of the 
estimated aggregate income, at the beginning of the period, and for three-fourths 
of the organizations, representing 92 per cent of aggregate income, at the end. 
Only a few religious organizations refused information. Unfortunately, much 
estimating was necessary to complete the data. The missing figures were sup-
plied by assuming trends and fluctuations like those found for similar organiza-
tions. 

The report is brief and clearly written. Precision is not claimed for the data, 
but they are presumed to give general impressions which are reliable. Out- 
standing conclusions are: 


Philanthropic contributions both in actual amounts and when reduced to con- 
stant purchasing power show large growth over the quarter century. They have 
remained, however, in about the same ratio to total wealth, and the contribution 
in constant purchasing power per capita of population has remained about the 
same. 

Personal contributions have shown marked cyclical fluctuations, with inter-
vals of about four years, which are not correlated with cycles in business activity.
Expenditures, on the other hand, have undergone regular increase. The cycles
in contributions pertain to the contributions of living persons to secular charities.
They are not found in contributions to religious organizations, in “contributions”
obtained through taxation, or in bequests.

Earnings and income from investments represent a growing proportion of the
total income of philanthropic organizations. Gifts of $100 or more are much more
important than small gifts in financing their work.

1p of private secular organizations have increased much more 
rapidly than those of governmental organizations. Religious organizations have 
lost ground as compared with both groups of secular organizations. 


The reader will wish for more and more exact functional classification of ex-
penditures. It would have been desirable to show trends which might reflect
changes in the activities of some of the groups of organizations, as, for example,
by segregating in the case of organizations giving outdoor relief expenditures for
direct financial aid. But the character of the records available prevented doing
this.










In comparing trends for various classes of organizations, it is shown that, rel- 
atively, expenditures for health work have increased and those for religious 
work have decreased most conspicuously. The conclusion is drawn that expendi- 
tures for health work have been steadily encroaching on those for religious work. 
While this is evidently true of total direct expenditures, the point might have been 
made that, if expenditures representing philanthropic work only were compared, 
the increase in expenditures for health work, both actual and relative, would be 
very greatly reduced. The data on income show that that part of the ex- 
penditures of hospitals which represents services paid for causes the conspicuous 
increase in expenditures of the health work group of agencies. If expenditures 
for philanthropic work only are compared, health work has not encroached on re- 
ligious work more than other types of welfare work. 

The study presents a method which should be applied in many other communi-
ties. Already in New York a study modeled on this is nearing completion.
With Community Funds in more than a dozen cities now raising for their member 
agencies alone at least a million dollars annually, analysis of trends and fluctua- 
tions in philanthropy becomes highly important. Repetition of this study in 
other communities is needed to test the general applicability of its findings. 

It is to be regretted that it was not possible to study in New Haven, as in- 
tended, the cost of raising money for philanthropic purposes. On this point com- 
prehensive information has long been due the public. It is desired alike by those 
who give and those who spend the money. But apparently it cannot be had un- 
til an improved and standardized system of accounting is adopted by philan- 
thropic agencies. 





RALPH G. HURLIN
Russell Sage Foundation 


Corporation Profits, by Laurence H. Sloan. New York: Harper and Brothers.
1929. 365 pp.

In this book the author presents some statistics of the larger industrial corpo- 
rations and some criticisms of the present contents of corporate reports. We may 
assume that the statistics are the best of the kind readily available, but if so we 
must recognize how unsafe it is to draw inferences from them without a careful 
study of each individual company. 

The author points to the fact that the data cover only two years as the chief
deficiency of the volume. A more important if less easily remedied deficiency 
is the lack of homogeneity in the material. This defect impairs the significance 
of all the statistics, most seriously, perhaps, where computations are based 
wholly or largely on book values of capital, as for instance in Chapter IV, in 
which percentages of depreciation and depreciation charges are discussed, and 
Chapter VII dealing with the percentage of earnings on invested capital. 

In any such discussions it is essential to understand how varied are the bases 
on which the capital assets of industrial corporations are carried. The point 
is particularly important just now because in the last quarter of a century we 
have had radical changes both in the practice relating to incorporation and in 















































price levels. Capital assets may be stated on the basis of cost or on the basis of 
a valuation. It may be a pre-war or a post-war basis. The cost may be a cost 
in cash or a cost in securities. If the latter, it may be a legal cost measured by 
a par value of a grossly inflated stock issue if the corporation was formed early 
in the century, or it may be greatly understated if the assets were acquired by a 
recent issue of stock without common value. 

A computation which ignores such differences can scarcely be regarded as more 
significant than one showing the average consumption of food by a group of 
animals in which mice, rabbits and elephants are included in undisclosed pro- 
portions. The difficulty of securing a really satisfactory grouping is undoubt- 
edly great, but the reader is at least entitled to a clear statement of the defective 
character of the material used. It may be that the author appreciates fully the 
varied character of his material, and has satisfied himself that after making due 
allowance therefor his conclusions are valid and significant. He does not, how- 
ever, succeed in creating such an impression. 
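
The objection to percentages computed on mixed book values can be made concrete with a small sketch. The three companies and all figures below are hypothetical, chosen only for illustration and not taken from the book under review: each is assumed to earn the same amount on the same physical plant, while carrying that plant on a different basis.

```python
from dataclasses import dataclass


@dataclass
class Company:
    name: str
    earnings: int       # annual earnings in dollars (hypothetical)
    book_capital: int   # capital assets as carried on the books (hypothetical)


def percent_earned_on_capital(company: Company) -> float:
    """Earnings as a percentage of book capital."""
    return 100 * company.earnings / company.book_capital


# Identical earnings, but three different bases for stating capital assets.
companies = [
    Company("A: pre-war cash cost", 1_000_000, 10_000_000),
    Company("B: post-war valuation", 1_000_000, 20_000_000),
    Company("C: par value of inflated stock", 1_000_000, 40_000_000),
]

rates = [percent_earned_on_capital(c) for c in companies]

# A pooled percentage for the group depends on how the bases happen to be mixed.
pooled = 100 * sum(c.earnings for c in companies) / sum(
    c.book_capital for c in companies
)

for company, rate in zip(companies, rates):
    print(f"{company.name}: {rate:.1f} per cent")
print(f"Pooled: {pooled:.2f} per cent")
```

Under these assumed figures the three percentages differ by a factor of four although the businesses are identical in every respect except bookkeeping, and the pooled figure reflects nothing but the proportions in which the bases are represented.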

The criticisms of present practice and the suggestions for improvement are 
of a rather perfunctory and elementary character though put forward with an 
air of daring innovation. The form of ideal report suggested bears a strong 
resemblance to the standard forms set forth in the first general corporation act 
in England, that of 1862. 

The author reveals no understanding of the complexities of accounting in a 
large business corporation of today or of the philosophy of accounting. Those 
who hope to find in this work an important contribution to the solution of a 
question of great and growing importance will be disappointed. 

GEORGE O. MAY


Supply of Electrical Equipment and Competitive Conditions. Senate Document 
No. 46, 70th Congress, 1st Session. Washington: United States Government
Printing Office. 1928. 282 pp. 

This final (?) Report of the Federal Trade Commission relative to the Supply
of Electrical Equipment and Competitive Conditions is primarily an interesting,
though not conclusive, historical document. It contains a good deal of useful in-
formation on the origin, development, and relative status of electrical manu-
facturing companies in the United States. Particular attention is devoted to
the General Electric Company and its domestic and foreign affiliations.

It is quite obvious from the Report that no one company has any real monop- 
oly in the manufacture of electrical equipment. It should further be clear that 
relatively great size in the industry is due to unusual progress along technical 
lines, and unusual capacity for rendering real service in the central station field. 

A good deal of attention is given in the second part of the Report (“Compe-
tition in the Electric-Power Industry”) to the National Electric Light Associa-
tion and other organizations which tend to bring together electrical manufacturers
and consumers of electrical equipment. Probably this part of the Report might
have been changed somewhat had it been written a year later, after the Con-
gressional investigations of local Public Utility Information Bureaus and after






















the recent publicity thrown on the attempts of a well known company to secure 
control of newspapers in order to make a ready market for paper and pulp. No 
doubt, also, if these investigations had been held a year or two later, consider- 
able attention might have been given to the rather spectacular pyramiding of 
so-called “investment trusts,” formed primarily to hold the stock of public 
utility holding companies. However, so far as the present Report is concerned, 
the Federal Trade Commission has apparently been unable to find much to crit- 
icize in the electric light and power industry. 

It is striking that the present report has omitted all reference to the decreas- 
ing price tendency in many electrical supplies and equipment, due to engineering 
improvements and the benefits of mass production. Mention might have been 
made of the fact that electric lamps are probably selling, on a relative efficiency 
basis, at not more than two-thirds the 1914 price, whereas manufactured commodities 
in general are selling from 35 per cent to 40 per cent above the pre-war price. Fur- 
ther, it might have been pointed out that, for the general line of electrical sup- 
plies sold by jobbers, the price has for many years been far below the general 
price level, as compared with pre-war. 

To anyone who has been intimately associated with the electrical industry, 
it seems rather futile to raise the question of “monopoly,” when the fact of the 
matter is that competition has always been unusually keen. Because of im- 
portant technical and engineering considerations, it has, of course, been necessary 
to agree to certain standards and specifications, and certain methods of cost 
accounting. Otherwise there could have been no progress in the electrical in-
dustry. This policy has been fostered by the U. S. Department of Commerce.
Aside from this phase of the matter, however, each element in the industry has 
usually been quite free to plan its own procedure, to run its own risks, and sur- 
vive, if it could, in the competitive struggle. 

EDMOND E. LINCOLN


The Determination of Secular Trends, by E. D. Mouzon, Jr. Urbana: University 
of Illinois, Bureau of Business Research, Bulletin No. 25. 1929. 71 pp. 


“The purpose of the bulletin is to outline that procedure (of fitting computed 
curves to data in question) for certain fundamental types of curves in such a way 
that curve fitting may more readily become a part of the practice of the statistical 
departments of business firms.” (Preface.) In pursuance of this purpose 
Dr. E. D. Mouzon, Jr., discusses the following topics: the nature of the secular 
trend; the preliminary steps (homogeneity of data, trend period, type of curve, 
plotting); the calculation of the trends (straight line, second degree parabola, 
both to natural data and to logarithms, modified exponential). Two appendices 
discuss some of the mathematical background of curve fitting and the essentials 
of logarithms. 

The reviewer is inclined to take emphatic exception to the methods of statistical 
education as propounded by this bulletin. The secular trend is defined by Dr.
Mouzon, and quite properly, as “an approximation to the general movement of 
the data over periods of greater duration than the short-time oscillations of 











































seasonal and cycle” (p. 12). Are, then, the mathematical curves the only 
method of obtaining such an approximation? The bulletin does not mention 
moving averages, a most serviceable device used to “isolate the short-time 
movements of these data” (p. 11). While the discussion of the nature of the
trend dwells commendably on the relative character of the concept, such sophisti- 
cation is lacking when methods of describing secular trends are discussed. 
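
The moving average that the reviewer misses can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a procedure from the bulletin: the series is hypothetical, the window is assumed odd so that each average can be centered on a year, and the function name is introduced here only for the example.

```python
def centered_moving_average(values, window):
    """Centered moving average; positions lacking a full window are None."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        trend[i] = sum(values[i - half:i + half + 1]) / window
    return trend


# Hypothetical annual figures, for illustration only.
series = [10, 12, 11, 15, 14, 18, 17, 21, 20]

trend = centered_moving_average(series, 3)

# Deviations from the moving average isolate the short-time movements.
deviations = [None if t is None else y - t for y, t in zip(series, trend)]

for year, (y, t) in enumerate(zip(series, trend)):
    print(year, y, "-" if t is None else round(t, 2))
```

No functional form is assumed for the trend, which is the chief attraction of the device; the price is the loss of values at both ends of the series and an arbitrary choice of window length.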

In regard to the types of curves, some statements sound rash, if not naive. “In 
general, the appearance of any trend is either that of a straight line, or that of a 
slightly convex or concave curve.” “The straight line is the most useful . . . of
all the curves which are employed to represent trend . . . because it represents,
approximately, the general long-time movement of the majority of time sequences”
(p. 15). Of a number of production series and of most of the price series this 
statement is certainly not true, provided a period longer than twenty years 
is taken. 

The discussion of the methods of fitting is simple and circumstantial. But 
even here, should one present the least squares procedure without pointing out its 
dangers and limitations? There is no mention of alternative procedures, even 
when they are as simple as the method of semi-averages for a straight line. 
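
The contrast between least squares and the method of semi-averages can be illustrated as follows. This is a sketch under assumptions, not the bulletin's own computation: time is coded 0, 1, 2, ... for successive years, the series is hypothetical, and for an odd number of years the middle year is assumed to be dropped from both halves.

```python
def least_squares_line(y):
    """Intercept and slope of the least-squares straight line, with x = 0, 1, 2, ..."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    x = range(n)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def semi_average_line(y):
    """Intercept and slope of the line through the averages of the two halves."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    mean_first = sum(y[:half]) / half
    mean_second = sum(y[n - half:]) / half  # middle year dropped when n is odd
    x_first = (half - 1) / 2                # time-axis center of the first half
    x_second = (n - half) + (half - 1) / 2  # time-axis center of the second half
    slope = (mean_second - mean_first) / (x_second - x_first)
    return mean_first - slope * x_first, slope


# Hypothetical series with one extreme final year, for illustration only.
series = [10, 12, 11, 15, 14, 18, 17, 40]

print("least squares:", least_squares_line(series))
print("semi-averages:", semi_average_line(series))
```

Because least squares weights deviations by their squares, a single extreme year near the end of the period pulls the slope more strongly than it does under semi-averages. That sensitivity is one of the limitations a beginner would need to be warned about.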

One wonders what specific useful purpose is served by this bulletin which is not 
attained more happily by a good statistical textbook, such as that of F. C. Mills 
or of R. E. Chaddock. The only virtue of the bulletin is the explicitness of 
its directions, which may be useful to computers innocent of statistical technique. 
Is it wise to entrust these beginners with a curve fitting technique, without cau- 
tioning them about some arbitrary features of the procedures, without pointing 
out the alternative methods, and leaving them to be guided by somewhat rash 
generalizations which can be true only of a limited body of statistical data? 

SIMON KUZNETS

National Bureau of Economic Research, Inc. 


Funeral Costs, by John C. Gebhart. New York: G. P. Putnam’s Sons. 1928.
319 pp.

This work is of more interest to the economist and the sociologist than to the 
statistician. It is concerned with an historical and analytical study of the fu- 
neral industry, undertaken at the instance of the Advisory Committee on Bur- 
ial Survey, of which Mr. Lawson Purdy, Director of the Charity Organization 
Society of New York, is chairman. Of particular value are Chapters IV and V, 
which are devoted to an analysis of funeral bills in New York, Chicago, and other 
American cities, and Chapters IX and X, which are given over to a discussion
of the economics of the burial industry. The examination of burial costs brings 
out two significant facts: (1) The lower income groups spend a far larger propor- 
tion of their resources on funerals than do those occupying the upper economic 
levels; (2) low-income families spend relatively little on ‘‘extras,’’ such as monu- 
ments, cemetery lots, and the like, while high income groups devote as much as 
50 per cent of their funeral expenditures to such items. In other words, the 
poor man’s funeral is given over chiefly to immediate show, while the obsequies of 
















the well-to-do take into account perpetuation of the family name and continued 
respect for the dead, as well as the ephemeral gesture of the funeral itself. 

The economic analysis constitutes an exceedingly valuable contribution to 
the study of American economic society. The funeral industry is unique in that 
its “demand curve” is fixed absolutely by the death-rate, and, further, in that
the decrease in mortality in this country has been so pronounced during recent 
decades that the demand for the goods and services supplied by the funeral in- 
dustry has remained virtually stationary, despite the rapid increase in popula- 
tion. The author traces the reaction of the industry to this situation, particu- 
larly the practice of pushing elaborate and costly commodities upon the public 
(“merchandising upward”), and the resort to wasteful competitive devices 
by manufacturers. Two practices that have had an especially baleful influence 
on the industry are the over-multiplication of retailers, and the extension of 
excessive credits and “services” to marginal and sub-marginal “broker type”
undertakers. The author also points out that confusion, waste, and the crass- 
est sort of exploitation are encouraged by the fact that there is no means by 
which the consumer can appraise the worth of what he purchases, by the atmos- 
phere of emotion, hysteria, and compulsive social tradition in which the average 
bargain for funeral services is made, and by the tendency of funeral directors to 
base their charges mainly on goods (caskets, grave-vaults) when, as a matter of 
fact, the bulk of their expenses are absorbed in services. 

Constructive suggestions are made for the remedying of the existing unneces- 
sarily high cost, confusion and exploitation, chief emphasis being placed upon the 
elimination of uneconomic and unethical establishments and the recasting of the 
industry’s cost-keeping and pricing methods. It is noted that a beginning has 
already been made in bringing about these changes. 

The statistical methods employed in the study are simple and unpretentious, 
as befits the character of the data and the public addressed. The author is 
especially to be commended for giving a frank and circumstantial account of the 
preliminary steps taken in the gathering of his materials and of whatever am- 
biguities and imperfections are involved in them. 

The University of Buffalo


The Coöperative Pattern in Cotton, by Robert Hargrove Montgomery. New
York: The Macmillan Company. 1929. xvi, 335 pp.

Dr. Montgomery, who is professor of economics at the University of Texas
and a native of the Cotton Belt, writes as a first-hand observer of the efforts of
the Texas cotton growers to solve their marketing problems by large-scale co-
operative organization. He shows a broad, objective understanding of the dy-
namic human sentiments and reasonings which have influenced the efforts of
the growers to better their lot since the crisis of 1920. This book is a particularly
valuable and timely contribution to the literature on coöperative marketing
because it deals with the subject intimately, concretely, and sympathetically,
but without bias.

































By a review of specific evidence he shows that the competitive system, as it has 
operated, has failed to secure a fair price and a living wage for many cotton 
farmers. The results of the free competitive system as manifested in the reali- 
ties of life in the Cotton Belt have been far from wholesome or satisfactory. 
Hardship and unrest became particularly acute after the crisis of 1920. 

After the foregoing introduction, Dr. Montgomery relates vividly how Mr. 
Aaron Sapiro, who had been casually invited to attend a growers’ convention 
in 1920, so enthused many growers by his dramatic oratory that he became the 
dominant influence in the formative period of the new Association. In subse- 
quent chapters the ill effects of some of Mr. Sapiro’s methods and theories are 
pointed out, but full credit also is given him for several especially worth-while 
contributions. 

A detailed account is presented of the methods of organizing and operating 
the Texas Farm Bureau Cotton Association. The expectations, difficulties, 
mistakes, disappointments and successes of the association during the first six 
years of its existence are set forth fairly and frankly. The association did not 
succeed in securing the support of anything like the number of farmers originally 
expected and it has not been able to avert discontent when prices dropped for 
reasons beyond its control. The association has gradually improved its operat- 
ing organization, however, into a more efficient business enterprise and it has 
effected certain notable savings in marketing costs, such as warehousing and in- 
surance. 

While he is dealing with the specific performances of the association, Dr. Mont-
gomery keeps his feet firmly on the ground. In his last chapter, however, where
he undertakes to look broadly into the future, he does not preserve the same
balance. His view is that the coöperative organization should provide “an
acceptable standard of living for the people engaged in the industry,” by yielding
“an adequate and regular money income.” He cites specific examples to show
how thoroughly the industry has failed to achieve this end under the old com-
petitive system. We all will grant whole-heartedly the worthiness of the ob-
jective stated by Dr. Montgomery. We will grant that many conditions in the
Cotton Belt at the present time are deplorable. We will grant that reliance on
the old doctrine of competition has failed adequately to better conditions.
Nevertheless, it is too much to ask that the coöperative marketing association
should do the whole job of effecting an industrial and social revolution. The
coöperative association cannot regulate the weather nor offset the varying effects
of weather on different localities. It cannot control the size of farms to secure
economical operating units. It cannot control fashion, the Indian monsoon, or
the competition of substitute materials, all of which have a real or at least a po-
tential influence on the price of cotton. Reliance on monopolistic power, if it
could be attained, would be no safer than reliance on free competition.

A coöperative marketing association is primarily a business enterprise. As
Dr. Montgomery explains, the Texas association has made mistakes and blun- 
ders, but it also has made substantial progress in learning how to handle the 
business of its members. Dr. Montgomery errs, it seems to me, in his final 
chapter in proposing that its ultimate goal should be the effecting of a broad so- 













cial and industrial revolution, rather than the perfecting of a far more effective 
business organization. Other organizations are needed for assistance in solving 
those other problems. 
MELVIN T. COPELAND
Harvard University 


The Morris Plan of Industrial Banking, by Peter W. Herzog. Prize Monograph, 
Chicago Trust Company Prizes for Research Relating to The Financing of 
Business Enterprises. (1927 Award First Prize.) Chicago and New York: 
A. W. Shaw Company. 1928. 117 pp.

Mr. Herzog’s book is trying to do for the Morris Plan what Mr. Ziegfeld has done
for the American girl. The book is largely a eulogy of the Morris Plan system.
Mr. Herzog has done particularly good work: the book is well written, flows nicely, and
covers the Morris Plan in a very able way. I would like to have had the work
amplified by the broader subject of industrial banking rather than restricted to
an institutional history.

Now that the commercial banks have definitely gone into the field of small 
loans at substantially lower rates than the Morris Plan, one wonders whether the 
book is not primarily of historical value. 

To quote from the Committee of Award: 

The Morris Plan has been successful—so successful, in fact, that it has become
a notable financial enterprise with parent companies, “holding” companies,
subsidiaries, and all the structural ramifications by which financial magnitude is
now measured. It has gone much further than the lending of money on business
terms to people not rated by the credit agencies. It has financed installment
sales, set up a securities corporation, organized a plan of industrial insurance,
[…] a general finance and acceptance business and, through the Morris
Plan Corporation of America, provided a rediscount market for Morris Plan
banks.

These developments Mr. Herzog has covered in his study, blending the state- 
ment with an interpretation which describes the place of the Morris Plan organi- 
zations in the economic scheme and measures their social significance. 

JOHN G. ROPE


A Study of Interest Rates, by Karin Kock. London: P. S. King and Son. 1929.
x, 252 pp.

Perhaps the most puzzling and the least explored branch of economics is the 
theory of interest. The most fundamental question is seldom even asked. Why 
do interest rates vary from approximately 3 to approximately 8 per cent rather 
than from 1 to 3 per cent or from 20 to 30 per cent? It is easy enough to answer 
that the productivity of capital goods determines the demand for capital, that the 
willingness and ability of capitalists to postpone consumption make possible 
supply, and that the interest rate results from the supply and demand of capital. 
But this is a general statement and helps us really very little. A statistical 
analysis of the supply and demand of capital would throw much more light on the 
problem. Furthermore, the effect on the interest rate of such factors as the oper- 




















ations of the credit-creating agencies of our modern economic organization should 
be determined by statistical analysis before a satisfactory theory of interest can 
be formulated. 

When the reader picks up Mr. Kock’s book, A Study of Interest Rates, he should 
not expect a broad comprehensive statement or solution of the problem. If he 
does, he will be disappointed. Mr. Kock, however, has done a valuable service. 
He has attempted to describe the various interest rates—long-time bond interest, 
commercial paper rates, call money rates, the official bank rates, etc.—and their 
inter-relations. Most of his statistical data are presented to describe the courses 
of these various rates. Mr. Kock’s greatest contribution, in the opinion of the 
reviewer, is his description of the effects of the banking systems of the United 
States, England, and Sweden on the various interest rates of these countries. 

Mr. Kock is impressed, as are all students working on the theory of interest, 
by the differences in the long- and short-time rates. His analysis will undoubt- 
edly help in formulating the multifarious reasons why the different short-term 
rates vary so widely at times and why they diverge from long-time rates. He 
would have contributed much more to the solution of the problem had he used 
daily, weekly, or monthly—rather than yearly—figures in his analyses of the in- 
ter-relation of the various rates. 

Long-time trends in short-term interest rates are much less pronounced than 
are such trends in bond yields. Mr. Kock’s data show how difficult it is to 
generalize about the interest rate, without specifying which interest rate, and how 
for considerable periods whatever trend there is in short-term interest rates shows 
no correlation with the long-time trend. 

Mr. Kock’s study would be more useful had he attempted to evaluate statisti- 
cally the forces operating on the interest rate. Instead Mr. Kock has contented 
himself in some measure with statements of professors so-and-so concerning the 
relative importance of these various forces. True, on the effects of the banking 
systems he has given us a mass of valuable evidence. Perhaps his next book 
will be more concise, will emphasize the more important factors, and will give 
more data for the solution of the most fundamental question in interest theory. 

KEMPER SIMPSON 





Population, Land Values and Government. Regional Survey: Volume II, by
Thomas Adams, Harold M. Lewis and Theodore T. McCrosky. New York:
Regional Plan of New York and Its Environs. 1929. 320 pp.


This second volume of the New York Regional Survey is as comprehensive as 
its title indicates. In the subjects treated little has been overlooked that might 
be of value in the plan for the future development of the Region. The book 
itself is a work of art. The typography is unusually good and the text and tables 
are supplemented by a profusion of maps, charts and illustrations. 

On the side of analysis the work is somewhat less adequate, particularly in the 
part relating to population. Perhaps it is too much to expect that one volume 
should be both a compendium of information about the metropolis, and at the 
same time a key to its understanding. Mild criticism may also be directed at 










some of the graphs, especially those showing the composition and characteristics 
of the population, because of diagonal shading which produces annoying optical 
illusions. 

Part I is a study of the population growth of the metropolitan area of New 
York. Beginning with the economic basis of the concentration of population 
at the mouth of the Hudson, the history of urban development at this point is 
traced through to the present day. There are several maps of the New York 
Region showing population density and the limits of the built-up area at dif- 
ferent periods of the city’s history. Other maps show the influence of rapid 
transit lines on density and distribution of population. It is shown that nearly 
half the combined area of Brooklyn, Queens, The Bronx and Richmond consists 
of unbuilt land and that with proper distribution there is no need of congestion 
within the Region even with a population two or three times as great as at 
present. 

In an effort to arrive at the probable future growth of the Region, estimates 
were made by (1) Nelson P. Lewis of New York City, (2) Professors Pearl and 
Reed of Johns Hopkins University, (3) Professor Edwin B. Wilson and Mr. 
Willem J. Luyten of Harvard University, and (4) Mr. Ernest P. Goodrich of 
New York City. For the year 1965 these estimates vary from 14,510,000 
(Wilson and Luyten) to 21,067,000 (Lewis). The Pearl-Reed estimate was 
21,000,000. Goodrich by an entirely different method agrees closely with Wilson 
and Luyten for 1965, but his asymptote is 25,000,000 as compared with their 
16,667,000, and Pearl and Reed’s 34,900,000. Lewis estimates 34,698,000 in the 
year 2000 with no asymptote calculated. The differences between the Pearl- 
Reed and the Wilson-Luyten estimates are interesting in view of the fact that 
both estimates were made by the Pearl-Reed method. 
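
The Pearl-Reed method is usually understood as fitting a logistic curve, whose upper limit is the asymptote quoted above. The following is only a minimal sketch of that curve, not the calculation made by any of the estimators: the function name `logistic`, the growth rate `R` and the midpoint year `T0` are hypothetical values chosen for illustration, and only the two asymptotes are taken from the figures given in this review.

```python
import math


def logistic(t, K, r, t0):
    """Population at year t on a logistic curve with upper asymptote K.

    r is the growth-rate parameter and t0 the year at which the curve
    reaches half of K. Both are assumptions here, not fitted values.
    """
    return K / (1.0 + math.exp(-r * (t - t0)))


# Asymptotes quoted in the review for the two Pearl-Reed type estimates.
ASYMPTOTES = {"Wilson-Luyten": 16_667_000, "Pearl-Reed": 34_900_000}

# Hypothetical parameters, chosen only to make the sketch runnable.
R, T0 = 0.03, 1950


def far_future(name, year=10_000):
    """Value of the curve long after the midpoint, close to the asymptote."""
    return logistic(year, ASYMPTOTES[name], R, T0)
```

Because the asymptote and the other parameters are all fitted to the data chosen, two workers using the same form of curve may reach very different limits, which is the point the comparison above brings out.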

The discussion of land values is particularly illuminating. Effective use is 
made of maps and charts, showing the block frontage and acreage values over 
the entire Region. The concentration of land values is even greater than that 
of population, suggesting that wealth and the facilities for accumulating wealth 
are the dominant factors in determining the higher range of land values. Con- 
centration of population creates exchange value in land, but beyond this point 
values increase only with intensive use for business purposes. Congestion of 
population may even reduce land values. There is little comfort in this dis- 
cussion either for the land speculator or the single taxer. Measured by the 1913 
dollar, land values in New York City as a whole show no increase in 25 years. 
Perhaps it is more accurate to say that the entire increment of value within that 
time has been absorbed by increasing carrying charges. Owners of vacant land 
have suffered heavy losses, and in outlying sections thousands of lots have re- 
verted to the city or to local jurisdictions for non-payment of taxes. 

The influence on land values of rapid transit lines, high buildings, parks and
parkways, street widening, bridges and public improvements generally, is well
treated, and the conclusions reached have a very definite bearing on the drafting 
of any practicable plan for the future development of the Region. 

The chapters on Government give an admirable survey of the past and present 
political organization of the various subdivisions of the New York Region. The 
































































final chapter takes up some phases of public finance bearing particularly upon the 
financing of the proposed regional plan. In conclusion Mr. Charles D. Norton 
is quoted to the effect that: “The money which will carry out the Plan of New
York, is the money which New York will spend in any event, whether it has 
a plan or not. With a city plan, expenditures can proceed along permanent 
lines; without it, public expenditures are diverted into projects which are not 
enduring, and are therefore wasteful.” 
G. B. L. ARNER 


Démographie, by Lucien March. Paris: J. B. Baillière & Fils. 1929. 228 pp.


This text by the distinguished member of the faculty of the Institute of Sta- 
tistics in the University of Paris is the twelfth volume of a series of publications on 
hygiene edited by MM. Louis Martin and Georges Brouardel. It is also the 
latest of a long series of contributions to vital statistics in French which began with 
Messance’s renowned work in 1766. It is not generally known that, on the di- 
dactic side, the vital statistics and insurance literature of France for the past 200 
years is perhaps the best in the world. It is of unique philosophic character 
based upon a strong substratum of economic and social considerations. French 
vital statistics can hardly be understood to the fullest extent unless one reads the 
brilliant contributions of Buffon, Moheau, Messance, Neckar, Fourier, Duséjour, 
Condorcet and Laplace. Professor March discusses first the general results of
mortality statistics, distinguishing the subcharacters of sex, age, civil condition 
and season of the year. He takes up the more technical aspects of the treatment 
of population statistics, the construction of life or survival tables upon several 
different hypotheses. He then discusses some of the more technical considera- 
tions in the study of infant mortality, presenting graphic and analytical pro- 
cedures for the preparation of the facts of infant mortality in the first year of life 
by month of age. Then he presents an international review of infant mortality 
statistics over the past century. 

The French life table for the years 1908 to 1913 is then displayed. Farther 
on Professor March discusses some of the technical methods in the compilation of 
statistics of causes of death; statistics of mortality according to occupation; and of 
sickness, presenting here some rare historical data on the sickness experience of a
number of European insurance institutions. The second part of the treatise 
deals with birth statistics and their treatment, the problem of stillbirths, legiti- 
macy, the sex ratio at birth, births in relation to age of mother, the seasonal 
variation in the birth-rate, multiple births, and births in relation to occupation of 
parents. The third section of the text treats of methods and results inter- 
nationally for marriage statistics, identifying the factors of age at marriage, 
seasonal variation of the marriage rate and marriage according to occupation. 
The fourth section of the text discusses divorce statistics, and presents conven- 
ient comparisons of international data not always available in the English texts on 
the subject. 

The sixth and closing section deals with the fertility of marriage, and here there 










are both data and a statement of methods which should be of interest to American 
statisticians. During the past decade a number of American workers have been 
interested in securing basic data from the censuses of population, from birth and 
mortality statistics, which would throw some light on the fertility of marriage in 
the United States. The lack of such statistics in this country may account for 
the backward condition of the insurance business in respect to coverage for family 
contingencies (known as “contingency insurance” in Great Britain, birth, issue
and maternity insurance; “whole family” covers, etc.). Professor March’s book 
would serve as an excellent auxiliary text for the courses in vital statistics which 
are today presented in our schools of public health. It ranks with the works 
of Westergaard and of Von Mayr as a comprehensive text stressing chiefly the 
descriptive and international aspects of vital statistics.

It would be of value to teachers of vital statistics and insurance in the United 
States if some one were to review the literature in French and prepare a text 
therefrom. Very little is known about French vital statistics or insurance in the 
United States. Our students of insurance, public health and economics really 
lose a great deal because of this lack of acquaintance with the works of the great 
French scholars. Have we heretofore given our students eau sucrée? 


E. W. KOPF


A History of Prices and of the State of the Circulation from 1792 to 1856, by 
Thomas Tooke and William Newmarch. Reproduced from the Original, 
with an Introduction by T. E. Gregory. New York: Adelphi Company. 1928.
Six volumes in four. 3422 pp.

Economic theories concerned with the relation between money and prices, and 
with allied questions of currency and credit control, have been and doubtless will 
remain perennial sources of controversy among economists. At certain periods 
the issues involved have been pressing and immediate, and the controversies have 
been waged with exceptional ardor. We are living in such a time. The England
of a century ago was passing through just such another period. The questions 
then at issue differed in certain respects from those debated today, but there are 
fundamental points of resemblance. Out of the controversies of the earlier period 
there emerged a rich literature, a literature which gave form and substance to 
classical economics and, in so doing, left a deep impress on the economic thought 
of the entire world. In considerable part it was a literature of tracts and pam- 
phlets, of controversial documents aiming to support or destroy specific theses. 
But this pamphleteering and controversial writing inspired certain works which 
will remain classics as long as economic science endures. Among these is Tooke 
and Newmarch’s History of Prices and of the State of the Circulation. 

The scale and scope of the History, which appears now in four massive volumes, 
are awe-inspiring. Numerous pamphlets, digests of evidence before special com- 
mittees and lengthy statistical compilations were brought together by the authors 
in preparing their work, and the product has some of the defects to be expected 
from such a genesis. With the present reprint, however, is bound a lengthy
Introduction by Professor T. E. Gregory, an Introduction which illuminates the con-
ditions under which the History was written, and which presents a critical ap- 
praisal of the contributions of the authors. Professor Gregory’s essay greatly 
enhances the value of the History for the modern reader, who will find it of mate- 
rial aid in threading his way through the forests of statistics and the labyrinths of 
argument which these volumes include. 

These statistics and arguments do not lend themselves to condensation in any 
such brief review as this. The student of money and prices will sample this va- 
ried fare for himself. Two aspects of the work, however, aspects which reveal 
the authors in a peculiarly interesting relation to modern thought, may profita- 
bly be emphasized. 

Ricardo’s hand has lain heavily upon currency thought for several generations, 
and it is Ricardo’s policies, not those of his opponents in the great controversies of 
that day, which have found expression in legislation. But today the Ricardian 
principles of currency control and the Ricardian explanations of the relations 
among monetary, price and trade fluctuations are again finding opposition, and 
alternative principles and explanations are being urged. As Gregory points out, 
there is a singularly close resemblance between the views of Tooke, concerning 
the ultimate causes of price changes, and the views of many modern opponents of 
a rigid quantity theory. More than an antiquarian interest attaches to some of 
the arguments in these pages. 

In their study of price fluctuations Tooke and Newmarch made no use of index 
numbers. This is a serious imperfection in their work, for the significance of 
their data would have been clearer had this averaging device been employed. 
Yet their procedure had its merits, for it concentrated attention upon individual 
prices and upon the numerous forces affecting these prices. Their analysis of 
price fluctuations consists, in large part, of a detailed study of the seasonal and 
other particular forces assumed to have been chiefly responsible for the price 
changes of the Napoleonic and post-Napoleonic periods. Since Tooke and
Newmarch there has been, perhaps, an excessive preoccupation with index numbers of
prices and with those average price movements which are measured by such in- 
dexes. There is room for just such detailed study of particular forces, and of the 
diversities of price movements due to the action of such forces, as are described in 
the History. 
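
The contrast between an average price movement and the individual prices behind it can be shown with a small sketch. It assumes the simplest form of index, an unweighted arithmetic mean of price relatives, and the commodities and prices are hypothetical figures invented for illustration. Nothing here is drawn from the History itself.

```python
def price_relatives(base, current):
    """Each commodity's current price as a percentage of its base price."""
    return {name: 100.0 * current[name] / base[name] for name in base}


def simple_index(base, current):
    """Unweighted arithmetic mean of the price relatives (base = 100)."""
    relatives = price_relatives(base, current)
    return sum(relatives.values()) / len(relatives)


# Hypothetical prices, chosen so that the movements offset one another.
base_prices = {"wheat": 50.0, "iron": 20.0, "cotton": 10.0}
later_prices = {"wheat": 75.0, "iron": 20.0, "cotton": 5.0}

print(price_relatives(base_prices, later_prices))
print(simple_index(base_prices, later_prices))
```

In this made-up case the index stands at 100, as though nothing had happened, while wheat has risen by half and cotton has fallen by half. It is diversity of this kind that a study of particular prices keeps in view.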

These weighty volumes have not the swift swing of modern diction, nor is the 
story quite as direct as might be desired. But, with the new Introduction as 
guide, the unhurried and curious reader will find an exploration of these pages 
both diverting and profitable. 





FREDERICK C. MILLS 


Marketing Investigations, by William J. Reilly. New York: The Ronald Press 
Co. 1929. vii, 245 pp. 
This is an excellent discussion of methods of market research, particularly of 
methods of gathering information. The chapters are built about the “four main 



















steps which one follows in conducting a market investigation”: Planning the 
investigation (three chapters), gathering information (twelve chapters), deter- 
mining the reliability of the information and interpretation (six chapters), 
presentation of the data (one chapter). The book is based upon the author’s 
own experience. It is a sound presentation, avoids the tendency to mere enu- 
meration of problems to be investigated so common to books on market research, 
and is the most helpful treatise in the field. 
FRED E. CLARK
Northwestern University 













RECENT LITERATURE 


Due to the appearance in March of the new journal 


SOCIAL SCIENCE ABSTRACTS 


which is designed to cover fully and for the several social 
sciences the field dealt with superficially in the Recent 
Literature section of our JOURNAL, this section has been dis-
continued. 

Social Science Abstracts is published monthly and will 
afford thorough abstracts of new material appearing in period- 
icals, government publications, bulletins, reviews, etc. It 
covers foreign as well as local sources in the fields of cultural 
anthropology, economics, history, human geography, political 
science, sociology and statistics. 

For further information the reader is referred to our JOURNAL,
June, 1928, Vol. 23, No. 162, p. 187, and December,
1928, Vol. 23, No. 164, p. 448, and to 


THE EDITOR, Social Science Abstracts,
611 Fayerweather Hall, Columbia University, 
New York City. 



































Just Published
Wage Incentive Methods 


Selection 
Installation 
Operation 


By C. W. Lytle, M. E. 


Director of Industrial Coöperation
New York University 








Wage incentive methods whenever properly installed and operated have never failed
to reduce production costs substantially. Executives are coming more and more to 
study departments and even special lines of work to select the incentive plan that is best 
suited to their particular conditions. Frequently this brings about the need of modifying 
present plans, or of installing an altogether different plan, and, sometimes, even putting 
in several kinds of plans side by side. 

There are many incentive plans; obviously no one of them is best suited for all
circumstances. But such sweeping and conflicting claims are often made by advocates of 
particular plans that heretofore it has required long and laborious preliminary investiga- 
tion before you could feel safe in deciding just which plan or plans to adopt as the most 
suitable for your particular conditions. 


Now, for the first time, an impartial comparative study of every wage incentive plan
used in American industry is made available in Mr. Lytle’s new book. In one com-
plete manual, twenty-five basic wage incentive plans, together with their numerous
variations and modifications, are described and analyzed in comprehensive detail.
The material is presented so that every plan is comparable with every other plan.
Strong and weak features are given without bias. Tables and charts illustrate the
earning-performance variations and also the performance-cost variations. The immense
practical value of such a comparative presentation is self-evident.
The application of the plans described and their variations covers the complete sweep
of industrial operation. They are applied to all kinds of direct labor, indirect labor, labor
working under special conditions such as employee introduction to new work, repair
and maintenance, safety, factory transportation, storeroom operation, inspection, fore-
manship, regularity, quality, economy, sales, office work and supervision. Group
applications and the point system, both relatively new, are thoroughly covered.
Mr. Lytle presents the very latest American practice. His book is crammed with
illustrative data and specific examples of the practical operation of the methods described,
drawn from more than 100 representative American companies.


The book is thoroughly practical. Throughout, Mr. Lytle has in mind the needs of
the executive who wishes to install the most suitable form of incentive plan. The
examples represent every variety of working conditions in small, medium, and large
plants. This will aid you in finding sets of conditions similar to those in your own organ-
ization and thus in judging which plans are best suited to your own needs.

This is a book for those who must take care of the countless details of installing a
suitable wage system and getting it into successful operation as well as those mainly
concerned with questions of wage policy. Financial and accounting executives will value
it for the information it supplies as a check on the results shown by their organization's
wage payment methods compared with the results other firms are getting.


457 Pages 78 Charts 71 Tables Price $7.50 


Sent Postpaid for Examination — No Advance Payment Required 





The Ronald Press Company, Publishers 
Dept. M260, 15 East 26th Street, New York, N. Y. 











































ANNOUNCEMENT CONCERNING THE


APPOINTMENT OF RESEARCH ASSOCIATES BY THE 
NATIONAL BUREAU OF ECONOMIC RESEARCH 





The Directors of the National Bureau of Economic Research will appoint three 
Research Associates for the academic year 1930-31. The purpose of these 
appointments is to provide mature workers with facilities for the conduct of 
quantitative research in economics. They are not intended to aid persons 
working for higher degrees. 


Each Research Associate will receive a stipend of $3,600 per year, plus the ex- 
penses of the round trip between his home and New York. Research Associates 
will be in residence at New York during eleven months of the year beginning 
September 15, 1930. 


It is desirable that candidates for appointment as Research Associates have 
definite research projects under way at the date of application, and that these 
projects should have reached such a stage that completion within a period of one 
year may ordinarily be expected. Research projects proposed by candidates 
may fall in fields now cultivated by members of the staff of the National Bureau, 
or may relate to subjects not hitherto covered in the work of the National 
Bureau. It is assumed, however, that the work of Research Associates will 
deal primarily with the quantitative aspects of economic problems. 


Publication rights to the results of studies conducted by Research Associates 
will be reserved to the National Bureau of Economic Research. 


Appointment of Research Associates will be made upon recommendation of a 
Committee of six, five to be chosen from the group of university representatives 
on the Board of Directors of the National Bureau, the sixth to be appointed by 
the Social Science Research Council. 


Applications for appointment should be submitted to the Directors of the 
National Bureau of Economic Research, 51 Madison Avenue, New York City, 
not later than February 1, 1930. Each application should be accompanied by 
a summary statement describing the research project on which the applicant 
proposes to work. This statement should indicate the status of the project at 
the time the request is submitted. It may be accompanied by manuscript or 
other material. Since the National Bureau cannot assume responsibility for 
the safe return of such material, duplicate copies should be retained by the 
applicant. 


Application forms will be forwarded upon request. 


EDWIN F. GAY
WESLEY C. MITCHELL
Directors of Research

National Bureau of Economic Research,
51 Madison Avenue
New York City








